Hyperparam opt, more models, more flexible training #2

Merged: 128 commits, Dec 18, 2023

Conversation

@matsen (Contributor) commented on Nov 28, 2023

No description provided.

@matsen (Contributor, Author) commented on Dec 12, 2023

For the DNSM, I've decided to drop the padding mask, since we now handle masking across the whole sequence using Ns (ambiguous bases). But here is the old version:

    def forward(self, parent_onehots: Tensor, padding_mask: Tensor) -> Tensor:
        """Build a binary log selection matrix from a one-hot encoded parent sequence.

        Because we're predicting log of the selection factor, we don't use an
        activation function after the transformer.

        Parameters:
            parent_onehots: A tensor of shape (B, L, 20) representing the one-hot encoding of parent sequences.
            padding_mask: A tensor of shape (B, L) representing the padding mask for the sequence.

        Returns:
            A tensor of shape (B, L) representing the log level of selection
            for each amino acid site.
        """

        # Multiply by sqrt(d_model) to match the transformer paper.
        parent_onehots = parent_onehots * math.sqrt(self.d_model)
        # Have to do the permutation because the positional encoding expects the
        # sequence length to be the first dimension.
        parent_onehots = self.pos_encoder(parent_onehots.permute(1, 0, 2)).permute(
            1, 0, 2
        )

        # NOTE: not masking due to MPS bug
        out = self.encoder(parent_onehots)  # , src_key_padding_mask=padding_mask)
        out = self.linear(out)
        out = F.logsigmoid(out)
        return out.squeeze(-1)

    def selection_factors_of_aa_str(self, aa_str: str):
        """Do the forward method without gradients from an amino acid string and convert to numpy.

        Parameters:
            aa_str: A string of amino acids.

        Returns:
            A tensor of the same length as the input string representing
            the level of selection for each amino acid site.
        """
        aa_onehot = sequences.aa_onehot_tensor_of_str(aa_str)

        model_device = next(self.parameters()).device
        # Create a padding mask with False values (i.e., no padding)
        padding_mask = torch.zeros(len(aa_str), dtype=torch.bool).to(model_device)

        with torch.no_grad():
            aa_onehot = aa_onehot.to(model_device)
            model_out = self(aa_onehot.unsqueeze(0), padding_mask.unsqueeze(0)).squeeze(
                0
            )
            final_out = torch.exp(model_out)

        return final_out[: len(aa_str)]
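
For reference, here is a minimal sketch (not part of this PR) of how a single ambiguity-based mask could subsume the old padding mask: an amino-acid site is kept only when its codon is complete and free of Ns, and padded positions can simply be encoded as all-N codons so one mask covers both cases. The helper name `mask_of_nt_seq` and the codon-based convention are illustrative assumptions, not the repo's actual API.

    import torch

    def mask_of_nt_seq(nt_seq: str, site_count: int) -> torch.Tensor:
        """Hypothetical helper: True at amino-acid sites whose codon is complete
        and contains no ambiguous bases (Ns); padded sites stay False."""
        codons = [nt_seq[i : i + 3] for i in range(0, len(nt_seq), 3)]
        mask = torch.zeros(site_count, dtype=torch.bool)
        for site, codon in enumerate(codons[:site_count]):
            mask[site] = len(codon) == 3 and "N" not in codon
        return mask

Once the MPS masking issue is worked around, a mask like this could be inverted and passed to the encoder as `src_key_padding_mask` (PyTorch expects True at positions to be ignored), so `selection_factors_of_aa_str` would no longer need to build a separate padding mask.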

@matsen merged commit 6fe79dc into main on Dec 18, 2023