Evo2 #694

Open. Wants to merge 143 commits into base: main.

Conversation

jstjohn (Collaborator) commented Feb 19, 2025

Description

This PR provides an implementation of Evo2 that supports pre-training, fine-tuning, and preprocessing of data from FASTA files. It makes use of the new Hyena/Evo2 model support in NVIDIA/NeMo#12263.
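
For readers unfamiliar with the input format, here is a minimal sketch of reading records from FASTA text. It is not the preprocessing code in this PR: the function name `read_fasta` and the example sequences are hypothetical, and the sketch only assumes the standard FASTA layout of a `>` header line followed by one or more sequence lines.

```python
import io
from typing import Iterator, List, Optional, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA-formatted text stream.

    Hypothetical helper for illustration only; not part of this PR.
    """
    header: Optional[str] = None
    chunks: List[str] = []
    for raw_line in handle:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped; join them back together.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Made-up example input with one wrapped record and one single-line record.
example = io.StringIO(">chr1 demo\nACGT\nTTGA\n>chr2\nGGCC\n")
records = list(read_fasta(example))
```

A generator keeps memory use proportional to a single record rather than the whole file, which matters for genome-scale inputs.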

Known issues

  • FP8 settings do not exactly match Savanna's, so fine-tuning with or without FP8 works well with some checkpoints but not others.

Type of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Refactor
  • Documentation update
  • Other (please describe):

Pre-submit Checklist

  • I have tested these changes locally
  • I have updated the documentation accordingly
  • I have added/updated tests as needed
  • All existing tests pass successfully

cspades and others added 30 commits November 16, 2024 10:46
… debt in tokenizer and config, remove unused args in infer.py.
…d add transcript splicing script for preprocessing.
jstjohn and others added 5 commits March 4, 2025 19:08
Refactor out Fasta dataset class to its own file and add tests.

Signed-off-by: Jared Wilber <[email protected]>
Signed-off-by: John St John <[email protected]>
jstjohn enabled auto-merge March 4, 2025 22:21
jwilber and others added 6 commits March 4, 2025 14:46
Update README to mention predict functionality.

I currently link to the not-yet-built docs; not sure if we want to change that or not.

---------

Signed-off-by: Jared Wilber <[email protected]>
Signed-off-by: John St John <[email protected]>
trvachov (Collaborator) commented Mar 5, 2025

👀

Labels

  • INCLUDE_NOTEBOOKS_TESTS: Add Jupyter notebook validation to the CI pipeline
  • INCLUDE_SLOW_TESTS: Add unit tests marked as slow to CI pipeline

10 participants