Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

automatic official FASTA file fetching, several new utility functions related to structure including flexible 3d alignment that supports different length chains to be aligned! #101

Merged
merged 147 commits into from
May 26, 2024

Conversation

YoelShoshan
Copy link
Collaborator

No description provided.

YoelShoshan and others added 25 commits January 10, 2024 06:43
@YoelShoshan YoelShoshan requested a review from mosheraboh May 5, 2024 06:08
Copy link
Collaborator

@mosheraboh mosheraboh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.
Minors inline.

from typing import Optional


def main(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is this script for? please explain in the comments section.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added this desc in the docstring:

"Takes an input PDB files and splits it into separate files, one per describe chain, allowing to rename the chains if desired"

mask=None, # TODO: check
)

# apply_on_atom_pos = apply_rigid_on_dynamic_concat['atom_positions']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need the commented out code below?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

@@ -22,18 +22,26 @@ def __init__(self, verbose: bool = True) -> None:
self.chains_data = {} # maps from chain description (e.g. ('7vux', 'A')) to
self.flattened_data = {}

#
self.per_chain_most_frequent_residue_part = {}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The key is chain_id?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a key is a tuple in the format (pdb_id, chain_id)

added a comment above it with description

def add(
self,
pdb_id: str,
pdb_id_or_filename: str,
pdb_id: Optional[str] = None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you keep pdb_id? Is backward compatibility important here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pdb_id_or_filename might be a path to a local pdb file, so pdb_id gives the user an option to define the pdb_id as there is no implemented way to automatically extract it.

I'll check ways to extract it automatically from the pdb file itself, and check if it's important to allow the user to override that.

That will be a separate PR.

about what you asked - backward compatibility isn't super critical here, as we can likely detect all usage locations and port them to newer usage.

@@ -174,7 +176,7 @@ def load_protein_structure_features(
chain_id_type: str = "author_assigned",
device: str = "cpu",
max_allowed_file_size_mbs: float = None,
also_return_mmcif_object: bool = False,
# also_return_mmcif_object: bool = False,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

deleted

):
if not m_res:
continue
aa_idx = aa_idx.item()
p_res = p_res.clone().detach().cpu() # fixme: this looks slow
if aa_idx == 21:
# if torch.is_tensor(p_res):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete?

@YoelShoshan YoelShoshan merged commit 4a1208a into main May 26, 2024
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants