Baseline Implementation (VariPred) Model #5
VariPred is one specific fine-tuning approach which, for a given protein sequence:
I would say a simpler baseline would be to skip training altogether and instead use a distance metric between the wild-type and mutant embedding vectors, then check how well it correlates with the target (pathogenicity). This is similar to what is studied in the Nucleotide Transformer paper (Fig. 4). The best-performing distance metric would be our own baseline and the starting point for setting up the pipeline. After that, we can try fine-tuning as in VariPred, or possibly other strategies, to improve upon it.
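To make the proposal concrete, here is a minimal sketch of the training-free baseline. It assumes the wild-type and mutant embeddings are already available as one fixed-length vector per sequence (how they are produced and pooled is not decided in this thread), and that pathogenicity is encoded as 0/1. All function names are hypothetical, and the choice of Pearson correlation (which on binary labels is the point-biserial correlation) is my assumption, not something taken from VariPred or the Nucleotide Transformer paper.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def euclidean_distance(u, v):
    """L2 distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def manhattan_distance(u, v):
    """L1 distance between two embedding vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def pearson(xs, ys):
    """Pearson correlation; with 0/1 labels this is the point-biserial r."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


# Candidate metrics to compare (an illustrative selection, not exhaustive).
METRICS = {
    "cosine": cosine_distance,
    "euclidean": euclidean_distance,
    "manhattan": manhattan_distance,
}


def rank_metrics(wt_embeddings, mut_embeddings, labels):
    """Score each metric by how strongly the wild-type/mutant distance
    correlates with the pathogenicity labels, strongest first."""
    scores = {}
    for name, metric in METRICS.items():
        distances = [metric(w, m) for w, m in zip(wt_embeddings, mut_embeddings)]
        scores[name] = pearson(distances, labels)
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)
```

Calling `rank_metrics(wt_embeddings, mut_embeddings, labels)` returns `(metric_name, correlation)` pairs, and the first entry would be the candidate baseline. A rank-based measure such as Spearman correlation or AUROC may suit binary labels better, so treat the scoring function as a placeholder.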
Do you know the source of this model?
Nucleotide Transformer
I am actually not sure. Would their model be useful for us? It is for DNA sequences, but we work with proteins, right?