
masking from child sequences; DNSM model bugfix; Yun branch lengths #14

Merged
matsen merged 20 commits into main on Apr 23, 2024

Conversation

@matsen (Contributor) commented on Mar 5, 2024

  • take the mask from the child sequence (a rough sketch follows this list)
  • fix a bug in the DNSM models, which were interpreting "N" as an ambiguous amino acid
  • implement Yun-style branch lengths
  • eliminate the pre-training before branch length optimization
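
A minimal sketch of the masking idea: the per-site mask is taken from the child sequence, so positions where the child is ambiguous are dropped regardless of what the parent has there. The helper name below is hypothetical and not the actual netam API.

import torch

def mask_from_child(child_seq: str) -> torch.Tensor:
    # True at positions kept for training; False where the child base is ambiguous ("N").
    return torch.tensor([c != "N" for c in child_seq], dtype=torch.bool)

print(mask_from_child("ACNTACGN"))
# tensor([ True,  True, False,  True,  True,  True,  True, False])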

@matsen (Contributor, Author) commented on Mar 5, 2024

Yun's derivation:

[attached image: Yun's derivation]

@matsen requested review from ksung25 and mmjohn on Apr 18, 2024 at 12:03
@mmjohn left a comment

Looks great!

Running make test in my epam conda env gives the following errors on both ermine and local installs:

===================================== short test summary info ======================================
ERROR tests/test_dnsm.py::test_crepe_roundtrip - NameError: name 'training_method' is not defined
ERROR tests/test_netam.py::test_write_output - NameError: name 'training_method' is not defined
ERROR tests/test_netam.py::test_standardize_model_rates - NameError: name 'training_method' is not defined

@matsen (Contributor, Author) commented on Apr 19, 2024

Thanks for catching the test failure!! That was a spur-of-the-moment, last-minute change. Fixed now.

@mmjohn left a comment

Everything's working for me now.

@matsen (Contributor, Author) commented on Apr 23, 2024

I decided to eliminate the short training that ran before the first round of branch length optimization; it appeared here:

diff --git a/netam/framework.py b/netam/framework.py
index 42b348c..ec13b9d 100644
--- a/netam/framework.py
+++ b/netam/framework.py
@@ -656,9 +656,6 @@ class Burrito(ABC):
         else:
             raise ValueError(f"Unknown training method {training_method}")
         loss_history_l = []
-        self.mark_branch_lengths_optimized(0)
-        loss_history_l.append(self.train(3))
         optimize_branch_lengths()
         self.mark_branch_lengths_optimized(0)
         for cycle in range(cycle_count):

That pre-training was making the non-fixed SHM models harder to train, and an untrained model actually isn't so crazy scale-wise for the DNSM:

[attached image: scale comparison for an untrained DNSM model]

(Erick, this is here)
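
For reference, a rough paraphrase of the resulting flow; the names come from the diff above, while the loop body and the surrounding Burrito class are elided, so treat this as a sketch rather than the actual implementation.

def joint_train_after_change(burrito, cycle_count, optimize_branch_lengths):
    # 'burrito' stands in for the Burrito instance from netam/framework.py.
    loss_history_l = []
    # The 3-epoch warm-up (self.train(3)) that used to run here is gone, so branch
    # lengths are first optimized against the untrained model.
    optimize_branch_lengths()
    burrito.mark_branch_lengths_optimized(0)
    for cycle in range(cycle_count):
        # ... training / branch length optimization cycles continue as before
        pass
    return loss_history_l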

@matsen merged commit 8f0cbd9 into main on Apr 23, 2024