Evo2 #694

Open
wants to merge 149 commits into
base: main
149 commits
50db0ca
[cye/evo2-llm-dev] Private internal development branch for Evo2 in Bi…
cspades Nov 16, 2024
737f16c
[cye/evo2-llm-dev] Add rough draft of data preprocessing for Evo2.
cspades Dec 4, 2024
a142109
Add manual data test for evo2
jstjohn Dec 4, 2024
0ad0bee
Change remotes for submodules for now
jstjohn Dec 5, 2024
82c832f
Cye/nemo2 fixes
cspades Dec 5, 2024
945506f
Write model checkpoint context and set Evo2Dataset in the pre-training.
cspades Dec 10, 2024
4fc1d84
Fix inference script to make sense, i.e. no seq parallelism for decod…
cspades Dec 11, 2024
f5adde5
Cye/fix Hyena species biases
cspades Dec 16, 2024
b9dfd5c
Hyena golden value test
jstjohn Dec 19, 2024
e6278d9
[cye/blended-training] Expose blended weights for training Hyena.
cspades Dec 21, 2024
dd0aab1
Changes for 256 node training run
jstjohn Dec 23, 2024
0560ee4
Integrate BioNeMo Noodles into Hyena data preprocessing.
cspades Dec 24, 2024
5511fe7
[cye/lineage-str] Clean up interface for taxonomic lineage tokens in …
cspades Jan 3, 2025
92d0352
Changes made on 256 node branch
jstjohn Jan 3, 2025
923cbdf
Cye/hyena flops
cspades Jan 3, 2025
6460ea3
Fix broken import of blended training config.
cspades Jan 3, 2025
7e72f48
Cye/import fix
cspades Jan 3, 2025
45923c6
Add improved nsys profiling support
jstjohn Jan 6, 2025
c805984
[cye/hyena-doc-update] Add data preprocessing documentation, fix tech…
cspades Jan 7, 2025
f5b15f3
[cye/transcript-readme] Add main documentation snippets for Hyena, an…
cspades Jan 8, 2025
9ba9e07
Bump nemo version to the new context length insensitive code, and upd…
jstjohn Jan 10, 2025
854951f
added flag for tflops callback
dorotat-nv Jan 13, 2025
ada349e
[cye/evo2-ckpt-utils] Add Evo2 ZeRO-1/3 to NeMo checkpointing utils.
cspades Jan 13, 2025
652dfe0
Add test for evo2 tokenizer.
jwilber Jan 14, 2025
265a0be
Fix nemo-savanna repo build in CI
dorotat-nv Jan 14, 2025
fb09377
fixing format issues on evo2-dev
dorotat-nv Jan 14, 2025
9cacf1b
Add tests for parallel hyena operators used in evo2
jwilber Jan 14, 2025
9ac11eb
Rebase on OSS.
cspades Jan 14, 2025
5631b93
[cye/tp-comm-fix] Fix TP communication overlap inconsistency.
cspades Jan 15, 2025
9ae9af0
Add temporary fix for shard-tensor bug in Megatron-LM
dorotat-nv Jan 16, 2025
c032408
Add initial test for preprocess.py
jwilber Jan 17, 2025
b6d238f
Bump NeMo to pick up FLOPS calculations.
cspades Jan 17, 2025
7822c04
[cye/z3-log-fix] Fix parameter count log.
cspades Jan 21, 2025
9378223
[cye/docker-patch-fix] Move Megatron patch to BioNeMo base image in D…
cspades Jan 22, 2025
9b9176a
shipping hotfix for dockers built locally - fix from main 17c6b20513…
dorotat-nv Jan 24, 2025
329548a
[cye/1m-ckpt-config] Add HyenaConfig options for 1M context length di…
cspades Jan 24, 2025
2ca40b0
[cye/fix-tp-comm-overlap] Fix default tp_comm_overlap=True being used…
cspades Jan 24, 2025
a494478
reducing scope of tested folders for evo2-dev
dorotat-nv Jan 27, 2025
72a311e
Adds basic inference test
jomitchellnv Jan 28, 2025
d4cd785
[cye/deactivate-infer-tpcomm] Deactivate TP communication during infe…
cspades Jan 28, 2025
3ba8946
fix: ensure test looks in test file dir for required data
jwilber Jan 28, 2025
34938fc
m2.5 accuracy 7b runs
jstjohn Jan 29, 2025
5fe2576
Fixes `test_evo2.py` unit test and adds enhancements to existing unit…
jomitchellnv Jan 30, 2025
30e71e9
Fix bug in wandb logger argparse.
jstjohn Jan 30, 2025
635a5df
[cye/pad-loss-mask] Fixes TP comm overlap bug with sequence parallel …
cspades Feb 3, 2025
f141830
Add longphase dataset config to repo
jwilber Feb 3, 2025
493d444
bump Megatron-LM, nemo-savanna and rebase to main OSS
dorotat-nv Feb 5, 2025
1df0176
CI hotfix
dorotat-nv Feb 7, 2025
624e797
test: Create tests for Evo2Dataset mask_phylogenetic_tags
jwilber Feb 7, 2025
75205b0
[cye/torch_dist_fix] Remove torch_dist patch and bump Megatron, reorg…
cspades Feb 11, 2025
0f6efeb
Changes related to accuracy and perf with new nemo2 changes
jstjohn Feb 11, 2025
0af5f9e
[cye/tp-comm-fp8-wgrad-fix]Require --fp8-wgrad when using TP communic…
Feb 12, 2025
0917616
Adding evo2 to JET
dorotat-nv Feb 13, 2025
3d1e19e
Remove sample data from evo2-dev branch
dorotat-nv Feb 13, 2025
9811ae4
[BUGFIX] evo2-dev CI
dorotat-nv Feb 13, 2025
f9133f5
Remove test_mask_phylogenetic tags (moving to nemo repo)
jwilber Feb 14, 2025
e83bd26
attempt at merge -- nemo matches github main. Dockerfile has major bu…
skothenhill-nv Feb 14, 2025
a175d5b
Bump nemo to fix forward bug
jstjohn Feb 14, 2025
ea70cde
Add required changes to work with NeMo upstream
jstjohn Feb 14, 2025
dddf9a4
Add back new context manager for parallel state cleanup
jstjohn Feb 14, 2025
e192982
Move test_config into nemo where the code is
jstjohn Feb 14, 2025
d355729
Fix arg name mismatch
jstjohn Feb 15, 2025
d90c10d
add new license
jwilber Feb 15, 2025
a8432a2
remove tab from license
jwilber Feb 15, 2025
4f2ade5
Bump nemo to fix bug in dataset
jstjohn Feb 15, 2025
3965502
Bump NeMo commit for perf improved loss mask
jstjohn Feb 18, 2025
f09aa36
Adding options for controlling dropout to train.py
jstjohn Feb 18, 2025
955978d
Bump nemo and remove nograd decorator
jstjohn Feb 18, 2025
3e14262
Bump nemo with latest tag masking
jstjohn Feb 18, 2025
46baa5f
Cover non-DNA case due to bug in preprocessing, never have non-dna un…
jstjohn Feb 18, 2025
ef3f55e
Try reverting some of the recent fixes related to TP
jstjohn Feb 18, 2025
af9016e
Bump nemo version with better tested
jstjohn Feb 18, 2025
aafb7a3
Revert loss mask updates
jstjohn Feb 18, 2025
a966b8b
handle 0 token case more gracefully
jstjohn Feb 18, 2025
c4ef1f1
bump NeMo with proper handling of control character containing sequen…
jstjohn Feb 19, 2025
0976fac
Update remote pointers to new public NeMo branches
jstjohn Feb 19, 2025
04982ae
Remove unused Megatron torch_dist sizing patch.
cspades Feb 19, 2025
242f3fe
Remove fasta from test and replace with synthetic sequence
jstjohn Feb 19, 2025
22ada77
Move fasta creation utility into testing sub-package
jstjohn Feb 19, 2025
b5bdec8
Add a test that verifies that the new phylo tag masking code is faste…
jstjohn Feb 19, 2025
ac1bd1f
Move phylo tag benchmark to NeMo testing
jstjohn Feb 19, 2025
bfaebd1
Merge in main
jstjohn Feb 20, 2025
0ae0c50
Update Megatron-LM submodule to commit 62529f1d (has 1M context fix) …
jwilber Feb 20, 2025
2ba5da3
fix config typo in test
jstjohn Feb 21, 2025
253a7f2
bump NeMo to latest PR version
jstjohn Feb 21, 2025
82e9c47
Fix issue causing gh-docs-deploy failure (#698)
jwilber Feb 21, 2025
fa73a00
Update nemo pointer with PR updates
jstjohn Feb 21, 2025
f466774
Add new license to new files (failing ci) (#699)
jwilber Feb 21, 2025
39290e4
Change kingdom to domain in tag description
jstjohn Feb 21, 2025
b688975
Merge in upstream
jstjohn Feb 21, 2025
94c4283
Merge branch 'main' of github.com:NVIDIA/bionemo-framework into evo2
jstjohn Feb 21, 2025
15c7dca
Make new versions of the files available freshly converted from HF
jstjohn Feb 22, 2025
3324bd4
bump nemo version to fix broken import
jstjohn Feb 22, 2025
ce133d2
bump nemo to top of tree
jstjohn Feb 22, 2025
78f92b5
Adding in the predict method and test
jstjohn Feb 24, 2025
bb5f5a1
Merge branch 'main' of github.com:NVIDIA/bionemo-framework into evo2
jstjohn Feb 24, 2025
d9e4952
bump NeMo commit
jstjohn Feb 24, 2025
b148750
Fix multipart download naming in nemo
jstjohn Feb 24, 2025
ba1d9bf
Update docs for checkpoint conversion
jstjohn Feb 24, 2025
0af3e0a
shrink tests down to 1b case
jstjohn Feb 24, 2025
c5e42d8
add end to end fine-tuning tutorial
jstjohn Feb 25, 2025
544b7a8
ignore object hashes in precommit
jstjohn Feb 25, 2025
d7a8ea7
Bump nemo pointer to latest PR pointer
jstjohn Feb 25, 2025
07c48b8
Update ci/benchmarks/partial-conv/evo2_pretrain.yaml
jstjohn Feb 25, 2025
e779f60
Update ci/benchmarks/perf/evo2_pretrain.yaml
jstjohn Feb 25, 2025
a1c8048
Slightly smaller test_train.py
jstjohn Feb 25, 2025
46edcb6
Add missing main function for inference cli
jstjohn Feb 25, 2025
e81eef3
Add --batch-size option to predict
jstjohn Feb 25, 2025
4e5acda
Fixing the description of the 1b model
jstjohn Feb 25, 2025
5bd0e2c
remove hard-coded PBSS
jstjohn Feb 26, 2025
ca16c2a
Remove comment block from code
jstjohn Feb 26, 2025
5248e5d
evo2 train unit test (#704)
dorotat-nv Feb 27, 2025
1e7323b
Updates to benchmarks: evo2 (#705)
dorotat-nv Feb 28, 2025
24f1db0
Add brca1 zeroshot example + predict and scoring updates to evo2.
jwilber Mar 4, 2025
e012146
Add vortex style fp8 support to predict
jstjohn Mar 4, 2025
ec662e4
Update the brca notebook with a run on an fp8 supporting machine
jstjohn Mar 4, 2025
aabd6a4
Merge in upstream changes to bionemo
jstjohn Mar 4, 2025
66ead75
add missing/new NGC urls
jstjohn Mar 4, 2025
0c67976
Remove fasta from pre commit
jstjohn Mar 4, 2025
ae81e4d
Remove TODOs related to PBSS
jstjohn Mar 4, 2025
b15cc82
Moved test config into the tests/config dir with the other configs
jstjohn Mar 4, 2025
a752309
Address yaml location feedback
jstjohn Mar 4, 2025
b1bb99d
Add new test covering padding and seq dims
jstjohn Mar 4, 2025
4f67795
Address comments on documentation
jstjohn Mar 4, 2025
97d3845
Run pre-commit on docs
jstjohn Mar 4, 2025
7473e14
Address PR feedback on test naming
jstjohn Mar 4, 2025
6be9801
Refactor out fasta dataset, add tests for it (#716)
jwilber Mar 4, 2025
0e13ad0
Bump nemo commit with predict changes
jstjohn Mar 4, 2025
bf08649
no longer needed since we do not have committed fastas
jstjohn Mar 4, 2025
04914d2
Reformat to pass pre-commit
jstjohn Mar 4, 2025
48cab0a
update readme to mention predict (#717)
jwilber Mar 4, 2025
1d19941
Fix parallel short hyena operator test
jwilber Mar 4, 2025
14ef1ea
Add slow tests for 7b
jstjohn Mar 4, 2025
4f83438
Update faster 1b test with lower precision so it passes in CI
jstjohn Mar 4, 2025
f235b09
Merge branch 'evo2' of github.com:NVIDIA/bionemo-framework into evo2
jstjohn Mar 4, 2025
25fefc8
Address formatting issues
jstjohn Mar 4, 2025
e34c44d
Leave megatron-lm as is and add more stringent slow test along with l…
jstjohn Mar 4, 2025
51a8c7a
Bump nemo as well
jstjohn Mar 4, 2025
19b2289
Merge branch 'main' of github.com:NVIDIA/bionemo-framework into evo2
jstjohn Mar 4, 2025
3204869
Update pointer to evo2 test file
jstjohn Mar 5, 2025
70b266c
only run most stringent comparison with h100
jstjohn Mar 5, 2025
9c7e3f7
add missing ngc link for new 7b-8k checkpoint
jstjohn Mar 5, 2025
01f8f05
Fixing shape issue in parallel short hyena test
jstjohn Mar 5, 2025
88f2c48
Address issue with pycache when there are tests with the same name in…
jstjohn Mar 5, 2025
825879d
Move to per-package tests for slow as well as fast tests
jstjohn Mar 5, 2025
66d99f8
Handle no tests found case
jstjohn Mar 5, 2025
a6b2a4a
Add option for allowing no slow tests for a submodule
jstjohn Mar 5, 2025
70bfee4
Handle exit code capturing within the context of a pipefail script pr…
jstjohn Mar 5, 2025
46fee2c
Merge branch 'main' into evo2
jstjohn Mar 5, 2025
4 changes: 3 additions & 1 deletion .github/workflows/unit-tests.yml
@@ -119,7 +119,9 @@ jobs:
(github.event_name == 'pull_request' && contains(github.event.pull_request.labels.*.name, 'INCLUDE_SLOW_TESTS'))
env:
BIONEMO_DATA_SOURCE: ngc
-run: pytest -v -m "slow" sub-packages/
+# Not every sub-package has slow tests, and since some sub-packages have tests under the same name we need
+# to run package by package like we do with the fast tests.
+run: ./ci/scripts/run_pytest.sh --no-nbval --only-slow --allow-no-tests

- name: Run notebook tests
if: |
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -22,7 +22,7 @@ repos:
exclude: package.lock.json
- id: detect-secrets
name: detect-secrets (notebooks only)
-args: ['--baseline', '.secrets-nb.baseline', '--exclude-files', '^.(?!.*\.ipynb)', '--exclude-lines', '"(hash|id|image/\w+)":.*', ]
+args: ['--baseline', '.secrets-nb.baseline', '--exclude-files', '^.(?!.*\.ipynb)', '--exclude-lines', '"(hash|id|image/\w+)":.*|<.*at 0x[0-9a-f]+>|object at 0x[0-9a-f]+', ]
- repo: local
hooks:
- id: license-header-check
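
To see what the extended `--exclude-lines` pattern covers, the sketch below checks a few sample lines against it with `grep -E`. It is an approximation and not how detect-secrets itself evaluates the option: `is_excluded` and the sample lines are hypothetical, and `\w` is written as `[[:alnum:]_]` so that the pattern is valid POSIX ERE.

```shell
# Pattern mirroring the new --exclude-lines value, with \w spelled as
# [[:alnum:]_] (an assumption made here so grep -E accepts it portably).
EXCLUDE_PATTERN='"(hash|id|image/[[:alnum:]_]+)":.*|<.*at 0x[0-9a-f]+>|object at 0x[0-9a-f]+'

# Hypothetical helper: succeeds when the line would be skipped by the pattern.
is_excluded() {
    printf '%s\n' "$1" | grep -Eq "$EXCLUDE_PATTERN"
}

# Sample notebook-output lines (made up for illustration).
for line in \
    '"id": "3f2a9c1e"' \
    '<function train at 0x7f3a2c1b4d30>' \
    'Foo object at 0x7f3a2c1b4d30' \
    'API_KEY = "abc123"'
do
    if is_excluded "$line"; then
        echo "excluded: $line"
    else
        echo "scanned:  $line"
    fi
done
```

The two new alternatives target Python object reprs, whose memory addresses change on every notebook run and would otherwise churn the baseline.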
4 changes: 2 additions & 2 deletions .secrets.baseline
@@ -139,9 +139,9 @@
"filename": "pyproject.toml",
"hashed_secret": "79670e9c9d1c7ea5b81a96a2053d81437712c78e",
"is_verified": false,
-"line_number": 44
+"line_number": 45
}
]
},
-"generated_at": "2025-01-15T19:06:19Z"
+"generated_at": "2025-01-30T14:18:42Z"
}
2 changes: 1 addition & 1 deletion 3rdparty/Megatron-LM
Submodule Megatron-LM updated 41 files
+28 −4 .gitlab/stages/00.pre.yml
+104 −27 .gitlab/stages/01.test.yml
+7 −1 .gitlab/stages/02.functional-tests.yml
+18 −4 .gitlab/stages/03.publish.yml
+11 −1 CHANGELOG.md
+10 −3 Dockerfile.ci.dev
+5 −2 Dockerfile.ci.lts
+18 −7 examples/multimodal/dataset_helpers.py
+43 −23 examples/multimodal/image_processing.py
+6 −0 examples/multimodal/multimodal_args.py
+9 −7 megatron/core/datasets/blended_megatron_dataset_builder.py
+356 −41 megatron/core/dist_checkpointing/strategies/async_utils.py
+1 −0 megatron/core/dist_checkpointing/strategies/base.py
+52 −33 megatron/core/dist_checkpointing/strategies/filesystem_async.py
+2 −2 megatron/core/dist_checkpointing/strategies/torch.py
+4 −2 megatron/core/model_parallel_config.py
+5 −3 megatron/core/optimizer/optimizer.py
+7 −2 megatron/core/parallel_state.py
+6 −2 megatron/training/arguments.py
+19 −5 megatron/training/async_utils.py
+1 −1 megatron/training/checkpointing.py
+78 −64 megatron/training/initialize.py
+15 −1 megatron/training/tokenizer/tokenizer.py
+8 −8 megatron/training/training.py
+0 −1 requirements/pytorch_24.10/requirements.txt
+1 −0 ...cases/gpt/gpt3_mr_mcore_te_tp1_pp4_persistent_ckpt_disable_bias_linear_dgx_a100_1N8G/golden_values_dev.json
+1 −0 ...cases/gpt/gpt3_mr_mcore_te_tp1_pp4_persistent_ckpt_disable_bias_linear_dgx_a100_1N8G/golden_values_lts.json
+54 −0 ...test_cases/gpt/gpt3_mr_mcore_te_tp1_pp4_persistent_ckpt_disable_bias_linear_dgx_a100_1N8G/model_config.yaml
+1 −0 ...3_mr_mcore_te_tp1_pp4_resume_torch_dist_persistent_disable_bias_linear_dgx_a100_1N8G/golden_values_dev.json
+1 −0 ...3_mr_mcore_te_tp1_pp4_resume_torch_dist_persistent_disable_bias_linear_dgx_a100_1N8G/golden_values_lts.json
+54 −0 ...t/gpt3_mr_mcore_te_tp1_pp4_resume_torch_dist_persistent_disable_bias_linear_dgx_a100_1N8G/model_config.yaml
+41 −34 tests/test_utils/python_scripts/launch_jet_workload.py
+12 −0 tests/test_utils/recipes/gpt.yaml
+4 −2 tests/unit_tests/dist_checkpointing/test_async_save.py
+3 −3 tests/unit_tests/dist_checkpointing/test_global_metadata_reuse.py
+8 −3 tests/unit_tests/dist_checkpointing/test_local.py
+3 −3 tests/unit_tests/dist_checkpointing/test_nonpersistent.py
+2 −2 tests/unit_tests/dist_checkpointing/test_optimizer.py
+3 −2 tests/unit_tests/dist_checkpointing/test_replication.py
+3 −4 tests/unit_tests/dist_checkpointing/utils.py
+50 −1 tests/unit_tests/test_optimizer.py
2 changes: 1 addition & 1 deletion 3rdparty/NeMo
Submodule NeMo updated 147 files
63 changes: 63 additions & 0 deletions ci/benchmarks/partial-conv/evo2_pretrain.yaml
@@ -0,0 +1,63 @@
scope: partial-conv
time_limit: 14400
script_args:
# All arguments referenced in the script string must be specified here.
# Arguments not referenced in the script string must have the 'arg' field specified.
# See jet/core/configs.py for the specification of the configuration class
workspace:
value: /workspace/bionemo2
key_segment: False
data_path:
value: /data/evo2
key_segment: False
model:
value: evo2
variant:
value: train
config_name:
value: 7b
precision:
value: fp8
nodes:
value: 4
gpus:
value: 8
batch_size:
value: 2
pp:
value: 1
tp:
value: 8
cp:
value: 1
acc_grad:
value: 1
max_steps:
value: 20000
script: |-
WANDB_API_KEY=$BIONEMO_WANDB_API_KEY python ${workspace}/sub-packages/bionemo-evo2/src/bionemo/evo2/run/train.py \
-d ${workspace}/sub-packages/bionemo-evo2/tests/config/test_dataset_config.yaml \
--dataset-dir ${data_path} \
--grad-acc-batches ${acc_grad} \
--fp8 \
--enable-preemption \
--ckpt-async-save \
--seq-length=8192 \
--tensor-parallel-size=${tp} \
--context-parallel-size=${cp} \
--pipeline-model-parallel-size=${pp} \
--workers 8 \
--num-nodes=${nodes} \
--devices=${gpus} \
--micro-batch-size=${batch_size} \
--model-size=${config_name} \
--max-steps=${max_steps} \
--limit-val-batches=20 \
--log-every-n-steps=50 \
--val-check-interval=500 \
--tflops-callback \
--experiment-dir=${tensorboard_dir}/${batch_size}bs_${nodes}node_${gpus}gpu_${max_steps}s_${precision}prec \
--wandb-project=${wandb_project_name} \
--wandb-group=${model}_${variant}_${config_name}__${target} \
--wandb-job-type=${pipeline_label} \
--disable-checkpointing;
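
For orientation, the batch geometry implied by these values can be worked out as follows. This is a sketch that assumes the usual Megatron convention (world size = data parallel x tensor parallel x context parallel x pipeline parallel) and that `--grad-acc-batches` multiplies the per-step batch; `global_batch_size` is a hypothetical helper and not part of `train.py`.

```shell
# Hypothetical helper: derive the global batch size from the benchmark values.
# Arguments: nodes gpus_per_node tp cp pp micro_batch_size grad_acc_batches
global_batch_size() {
    nodes=$1; gpus=$2; tp=$3; cp=$4; pp=$5; micro_batch=$6; acc_grad=$7
    world=$((nodes * gpus))
    model_parallel=$((tp * cp * pp))
    # The world size must split evenly into model-parallel groups.
    if [ $((world % model_parallel)) -ne 0 ]; then
        echo "world size $world is not divisible by tp*cp*pp=$model_parallel" >&2
        return 1
    fi
    dp=$((world / model_parallel))
    echo $((dp * micro_batch * acc_grad))
}

# partial-conv values: 4 nodes x 8 GPUs = 32 ranks, tp=8 gives dp=4.
global_batch_size 4 8 8 1 1 2 1   # prints 8
```

At `--seq-length=8192` that corresponds to 8 x 8192 = 65,536 tokens per optimizer step under the same assumptions.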
67 changes: 67 additions & 0 deletions ci/benchmarks/perf/evo2_pretrain.yaml
@@ -0,0 +1,67 @@
scope: perf
time_limit: 1800
script_args:
# All arguments referenced in the script string must be specified here.
# Arguments not referenced in the script string must have the 'arg' field specified.
# See jet/core/configs.py for the specification of the configuration class
workspace:
value: /workspace/bionemo2
key_segment: False
data_path:
value: /data/evo2
key_segment: False
model:
value: evo2
variant:
value: train
precision:
value: fp8
gpus:
value: 8
batch_size:
value: 2
max_steps:
value: 100
tp:
value: 8
cp:
value: 1
pp:
value: 1
acc_grad:
value: 1
products:
- nodes: 1
config_name: 7b
- nodes: 2
config_name: 7b
- nodes: 8
config_name: 40b
script: |-
WANDB_API_KEY=$BIONEMO_WANDB_API_KEY python ${workspace}/sub-packages/bionemo-evo2/src/bionemo/evo2/run/${variant}.py \
-d ${workspace}/sub-packages/bionemo-evo2/tests/config/test_dataset_config.yaml \
--dataset-dir ${data_path} \
--grad-acc-batches ${acc_grad} \
--fp8 \
--enable-preemption \
--ckpt-async-save \
--use-megatron-comm-overlap-llama3-8k \
--seq-length=8192 \
--tensor-parallel-size=${tp} \
--context-parallel-size=${cp} \
--pipeline-model-parallel-size=${pp} \
--workers 8 \
--num-nodes=${nodes} \
--devices=${gpus} \
--micro-batch-size=${batch_size} \
--model-size=${config_name} \
--max-steps=${max_steps} \
--limit-val-batches=20 \
--log-every-n-steps=50 \
--val-check-interval=${max_steps} \
--tflops-callback \
--experiment-dir=${tensorboard_dir}/${batch_size}bs_${nodes}node_${gpus}gpu_${max_steps}s_${precision}prec \
--wandb-project=${wandb_project_name} \
--wandb-group=${model}_${variant}_${config_name}__${target} \
--wandb-job-type=${pipeline_label} \
--disable-checkpointing;
32 changes: 29 additions & 3 deletions ci/scripts/run_pytest.sh
@@ -28,6 +28,8 @@ Options:
--skip-docs Skip running tests in the docs directory
--no-nbval Skip jupyter notebook validation tests
--skip-slow Skip tests marked as slow (@pytest.mark.slow)
--only-slow Only run tests marked as slow (@pytest.mark.slow)
--allow-no-tests Allow sub-packages with no found tests (for example no slow tests if --only-slow is set)

Note: Documentation tests (docs/) are only run when notebook validation
is enabled (--no-nbval not set) and docs are not skipped
@@ -50,6 +52,8 @@ declare -a coverage_files
SKIP_DOCS=false
NO_NBVAL=false
SKIP_SLOW=false
ONLY_SLOW=false
ALLOW_NO_TESTS=false
error=false

# Parse command line arguments
@@ -58,6 +62,8 @@ while (( $# > 0 )); do
--skip-docs) SKIP_DOCS=true ;;
--no-nbval) NO_NBVAL=true ;;
--skip-slow) SKIP_SLOW=true ;;
--only-slow) ONLY_SLOW=true ;;
--allow-no-tests) ALLOW_NO_TESTS=true ;;
-h|--help) usage ;;
*) echo "Unknown option: $1" >&2; usage 1 ;;
esac
@@ -85,6 +91,7 @@ PYTEST_OPTIONS=(
)
[[ "$NO_NBVAL" != true ]] && PYTEST_OPTIONS+=(--nbval-lax)
[[ "$SKIP_SLOW" == true ]] && PYTEST_OPTIONS+=(-m "not slow")
[[ "$ONLY_SLOW" == true ]] && PYTEST_OPTIONS+=(-m "slow")

# Define test directories
TEST_DIRS=(./sub-packages/bionemo-*/)
@@ -94,13 +101,32 @@ fi

echo "Test directories: ${TEST_DIRS[*]}"

clean_pycache() {
# Use the provided base directory or default to current directory
local base_dir="${1:-.}"
echo "Cleaning Python cache files in $base_dir..."
find "$base_dir" -regex '^.*\(__pycache__\|\.py[co]\)$' -delete
}

# Run tests with coverage
for dir in "${TEST_DIRS[@]}"; do
echo "Running pytest in $dir"

if ! pytest "${PYTEST_OPTIONS[@]}" --junitxml=$(basename $dir).junit.xml -o junit_family=legacy "$dir"; then
error=true
# Run pytest but don't exit on failure - we'll handle the exit code separately. This is needed because our script is
# running in pipefail mode and pytest will exit with a non-zero exit code if it finds no tests.
{ pytest "${PYTEST_OPTIONS[@]}" --junitxml=$(basename $dir).junit.xml -o junit_family=legacy "$dir"; exit_code=$?; } || true

if [[ $exit_code -ne 0 ]]; then
if [[ "$ALLOW_NO_TESTS" == true && $exit_code -eq 5 ]]; then
# Exit code 5 means no tests found, which is allowed if --allow-no-tests is set
echo "No tests found in $dir (exit code $exit_code) - continuing as --allow-no-tests is set"
else
echo "Error: pytest failed with exit code $exit_code"
error=true
fi
fi

# Avoid duplicated pytest cache filenames.
clean_pycache "$dir"
done

# Exit with appropriate status
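
The exit-code handling in this hunk can be exercised without pytest. The sketch below uses a hypothetical `fake_pytest` stand-in and a hypothetical `handle_status` helper; it keeps the `{ cmd; exit_code=$?; } || true` idiom from the script and treats exit code 5 (pytest's "no tests collected" status) as acceptable only when the allow flag is set.

```shell
set -e  # a failing command would normally abort the script here

# Hypothetical stand-in for one per-package pytest run; $1 is the status to simulate.
fake_pytest() { return "$1"; }

# Hypothetical helper. $1: captured exit code, $2: "true" if --allow-no-tests is set.
# Returns 0 when the status is tolerated and 1 when it counts as a failure.
handle_status() {
    if [ "$1" -ne 0 ]; then
        if [ "$2" = true ] && [ "$1" -eq 5 ]; then
            return 0
        fi
        return 1
    fi
    return 0
}

error=false
for simulated in 0 5 1; do
    # The brace group sits on the left of ||, so errexit is ignored inside it
    # and the status can be captured instead of terminating the script.
    { fake_pytest "$simulated"; exit_code=$?; } || true
    if ! handle_status "$exit_code" true; then
        error=true
    fi
    echo "simulated=$simulated captured=$exit_code error=$error"
done
```

Only the last iteration flips `error` to `true`; the "no tests" status is passed over because the allow flag is set.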
5 changes: 3 additions & 2 deletions ci/scripts/utils.sh
@@ -20,10 +20,11 @@ check_git_repository() {
if ! git diff-index --quiet HEAD --; then
if [ $? -eq 128 ]; then
echo "ERROR: Not in a git repository!" >&2
+return 1
else
-echo "ERROR: Repository is dirty! Commit all changes before building the image!" >&2
+echo "Warning: Repository is dirty! Commit all changes before building the image!" >&2
+return 0
fi
-return 1
fi
}
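
A possible follow-up for this function: inside `if ! cmd; then`, `$?` holds the status of the negated condition, which is 0, so the `128` comparison cannot be true at that point. A minimal sketch that captures the status first is given below. `classify_repo_status` is a hypothetical name, and the optional command argument exists only so the logic can be exercised without a repository.

```shell
# Hypothetical variant of check_git_repository that records git's exit status
# before branching on it. With no arguments it runs the same git command as
# the original; otherwise it runs the command given (useful for testing).
classify_repo_status() {
    if [ "$#" -eq 0 ]; then
        set -- git diff-index --quiet HEAD --
    fi
    status=0
    "$@" || status=$?
    case "$status" in
        0)
            return 0 ;;
        128)
            echo "ERROR: Not in a git repository!" >&2
            return 1 ;;
        *)
            echo "Warning: Repository is dirty! Commit all changes before building the image!" >&2
            return 0 ;;
    esac
}
```

Called without arguments, it keeps the behaviour introduced by this change: a dirty tree only warns, while a missing repository still fails.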

14 changes: 14 additions & 0 deletions docs/docs/user-guide/examples/bionemo-evo2/.gitignore
@@ -0,0 +1,14 @@
# ignore temp files made by this tutorial
# chromosome files
*.fa
*.fa.gz

# config files
*.yaml

# directories created during these notebook runs.
nemo2_evo2_1b_8k/
preprocessed_data/
pretraining_demo/
brca1_fasta_files/
brca1/