Support reading and writing intra-residue bonds in PDBx files #567

Merged on May 29, 2024 (4 commits)

padix-key
Member

This PR adds support for reading and writing intra-residue bonds via the chem_comp_bond category in PDBx files. Files containing this category were introduced in the PDB NextGen Archive.

@cisert cisert left a comment

I think a pretty common use case might be reading SDF files generated by some cheminformatics software (e.g., RDKit), and then wanting to have the same bond information in CIF format.
I tried reading an RDKit-generated SDF file (attached below, rename to aspirin.sdf), saving it as PDBx (including bonds), and then reading it back from CIF. The bond information did not seem to survive the round trip:

import biotite.structure.io as strucio
import biotite.structure.io.pdbx as pdbx

arr = strucio.load_structure("aspirin.sdf")
print(arr.bonds.as_array().shape) # (13, 3)

f = pdbx.CIFFile()
pdbx.set_structure(f, arr, include_bonds=True)
f.write("aspirin.cif")

arr2 = strucio.load_structure("aspirin.cif", include_bonds=True)
print(arr2.bonds.as_array().shape) # (0, 3)

Am I missing something?

aspirin.txt

custom_bond_dict : dict (str -> dict ((str, str) -> int)), optional
A dictionary of dictionaries:
The inner dictionary maps tuples of two atom names
to their respective bond types (represented as integer) for .

word missing in docstring?

Member Author

@padix-key padix-key May 28, 2024

True, rather an entire sentence. Thanks for spotting this
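
For reference, the nested mapping that the docstring describes could look roughly like the sketch below. The residue name, the atom names, and the integer bond-type codes are made up for illustration, and `lookup_bond_type` is a hypothetical helper, not part of Biotite.

```python
# Outer key: residue name; inner key: pair of atom names; value: bond type as int.
# All names and codes here are illustrative assumptions.
custom_bond_dict = {
    "LIG": {
        ("C1", "C2"): 1,  # e.g. a single bond
        ("C2", "O1"): 2,  # e.g. a double bond
    }
}


def lookup_bond_type(bond_dict, res_name, atom_a, atom_b):
    """Return the bond type for an atom pair, ignoring the order of the pair."""
    bonds = bond_dict.get(res_name, {})
    if (atom_a, atom_b) in bonds:
        return bonds[(atom_a, atom_b)]
    # Fall back to the reversed pair; None means no bond is defined
    return bonds.get((atom_b, atom_a))


print(lookup_bond_type(custom_bond_dict, "LIG", "O1", "C2"))
```

The order-insensitive lookup is a design choice of this sketch only; whether the real parameter treats `("C1", "C2")` and `("C2", "C1")` as equivalent is not stated in the docstring excerpt.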

@padix-key
Member Author

I reproduced your problem. There are two issues here:

  • The code for parsing intra-residue bonds in get_structure() was placed so that it only ran for AtomArrayStack. The latest commit fixes this.
  • SDF files contain neither residue names nor atom names, so res_name and atom_name are empty in the parsed AtomArray. However, chem_comp_bond in PDBx requires these two columns to identify bonded atoms, so you need to assign them before writing the CIF file:
    import biotite.structure.io as strucio
    import biotite.structure.io.pdbx as pdbx
    
    arr = strucio.load_structure("aspirin.sdf")
    arr.res_name[:] = "ASP"
    # Probably you would want a better atom naming scheme
    arr.atom_name[:] = [f"A{i}" for i in range(arr.array_length())]
    
    f = pdbx.CIFFile()
    pdbx.set_structure(f, arr, include_bonds=True)
    f.write("aspirin_temp.cif")
    
    arr2 = strucio.load_structure("aspirin_temp.cif", include_bonds=True)
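
As a possible alternative to the placeholder `A{i}` names in the snippet above, atom names could be derived from the element symbols so that each name stays unique within the residue. This is only a sketch: `make_atom_names` is a hypothetical helper, not a Biotite function, and it assumes the element symbols are available as a plain sequence of strings.

```python
from collections import Counter


def make_atom_names(elements):
    """Build unique names such as C1, C2, O1 from a sequence of element symbols."""
    counts = Counter()
    names = []
    for element in elements:
        symbol = str(element).upper()
        # Number each atom by how many atoms of the same element came before it
        counts[symbol] += 1
        names.append(f"{symbol}{counts[symbol]}")
    return names


# Hypothetical element list for a small fragment
print(make_atom_names(["C", "C", "O", "H", "O"]))
# ['C1', 'C2', 'O1', 'H1', 'O2']
```

Assuming the parsed AtomArray carries element symbols in `arr.element`, this could be applied as `arr.atom_name[:] = make_atom_names(arr.element)` before calling `pdbx.set_structure()`.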

@cisert
cisert commented May 29, 2024

Did the code snippet you posted above run for you? After pulling your changes and running it locally, I got an error on the final line:
TypeError: connect_via_residue_names() got an unexpected keyword argument 'custom_bond_dict'

@padix-key
Member Author

Did you reinstall the package? connect_via_residue_names() is part of a Cython module, so the compiled module still contains the old version unless the package is explicitly reinstalled.
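
One way to check whether this applies is to see if an imported module was loaded from a compiled file rather than from editable Python source. The helper below is a generic sketch using only the standard library; `is_compiled_extension` is a hypothetical name, and which Biotite module to inspect is left open here.

```python
import importlib.machinery
import json


def is_compiled_extension(module):
    """Return True if the module was loaded from a compiled extension file."""
    path = getattr(module, "__file__", None)
    if path is None:
        # Built-in modules have no file on disk
        return False
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


# json is a pure-Python package, so this reports False
print(json.__file__, is_compiled_extension(json))
```

If the check returns True for the module that defines connect_via_residue_names(), edits to its source only take effect after the extension is rebuilt, for example by reinstalling the package.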

@padix-key padix-key merged commit c4d060a into biotite-dev:master May 29, 2024
20 checks passed
@padix-key padix-key deleted the pdbx branch May 31, 2024 12:42