Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HETATM record type written out #2880

Merged
merged 19 commits into from
Aug 14, 2020
Merged
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions package/AUTHORS
Original file line number Diff line number Diff line change
Expand Up @@ -147,6 +147,7 @@ Chronological list of authors
- Andrea Rizzi
- William Glass
- Marcello Sega
- Mieczyslaw Torchala

External code
-------------
Expand Down
5 changes: 4 additions & 1 deletion package/CHANGELOG
Original file line number Diff line number Diff line change
Expand Up @@ -14,11 +14,14 @@ The rules for this file:

------------------------------------------------------------------------------
??/??/?? tylerjereddy, richardjgowers, IAlibay, hmacdope, orbeckst, cbouy,
lilyminium, daveminh, jbarnoud, yuxuanzhuang, VOD555, ianmkenney
lilyminium, daveminh, jbarnoud, yuxuanzhuang, VOD555, ianmkenney,
mieczyslaw

* 2.0.0

Fixes
* Instead of using ATOM for both ATOM and HETATM, HETATM record type
mieczyslaw marked this conversation as resolved.
Show resolved Hide resolved
keyword is properly written out by PDB writer (Issue #1753, PR #2880)
* Bond attribute is no longer set if PDB file contains no CONECT records
(Issue #2832)
* ResidueAttrs now have Atom as a target class by default; ICodes and
Expand Down
84 changes: 42 additions & 42 deletions package/MDAnalysis/coordinates/PDB.py
Original file line number Diff line number Diff line change
Expand Up @@ -472,10 +472,17 @@ class PDBWriter(base.WriterBase):

.. _`PDB 3.2 standard`:
http://www.wwpdb.org/documentation/file-format-content/format32/v3.2.html
.. _MODEL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#MODEL
.. _ENDMDL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#ENDMDL
.. _ATOM: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#ATOM
mieczyslaw marked this conversation as resolved.
Show resolved Hide resolved
.. _COMPND: http://www.wwpdb.org/documentation/file-format-content/format32/sect2.html#COMPND
.. _CONECT: http://www.wwpdb.org/documentation/file-format-content/format32/sect10.html#CONECT

.. _END: http://www.wwpdb.org/documentation/file-format-content/format32/sect11.html#END
.. _ENDMDL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#ENDMDL
.. _HEADER: http://www.wwpdb.org/documentation/file-format-content/format32/sect2.html#HEADER
.. _HETATM: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#HETATM
.. _MODEL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#MODEL
.. _NUMMDL: http://www.wwpdb.org/documentation/file-format-content/format32/sect2.html#NUMMDL
.. _REMARKS: http://www.wwpdb.org/documentation/file-format-content/format32/remarks.html
.. _TITLE: http://www.wwpdb.org/documentation/file-format-content/format32/sect2.html#TITLE

Note
----
Expand All @@ -493,6 +500,11 @@ class PDBWriter(base.WriterBase):
are not set (:code:`None` or :code:`np.zeros(6)`,
see Issue #2698).

When the record_types attribute is present (e.g. Universe object was
mieczyslaw marked this conversation as resolved.
Show resolved Hide resolved
created by loading a PDB file), ATOM_ and HETATM_ record type keywords
are written out accordingly. Otherwise, the ATOM_ record type is the
default output.

See Also
--------
This class is identical to :class:`MultiPDBWriter` with the one
Expand Down Expand Up @@ -528,6 +540,11 @@ class PDBWriter(base.WriterBase):
"{chainID:1s}{resSeq:4d}{iCode:1s}"
" {pos[0]:8.3f}{pos[1]:8.3f}{pos[2]:8.3f}{occupancy:6.2f}"
"{tempFactor:6.2f} {segID:<4s}{element:>2s}\n"),
'HETATM': (
"HETATM{serial:5d} {name:<4s}{altLoc:<1s}{resName:<4s}"
"{chainID:1s}{resSeq:4d}{iCode:1s}"
" {pos[0]:8.3f}{pos[1]:8.3f}{pos[2]:8.3f}{occupancy:6.2f}"
"{tempFactor:6.2f} {segID:<4s}{element:>2s}\n"),
'REMARK': "REMARK {0}\n",
'COMPND': "COMPND {0}\n",
'HEADER': "HEADER {0}\n",
Expand Down Expand Up @@ -608,10 +625,6 @@ def __init__(self, filename, bonds="conect", n_atoms=None, start=0, step=1,
multi frame PDB file in which frames are written as MODEL_ ... ENDMDL_
records. If ``None``, then the class default is chosen. [``None``]

.. _CONECT: http://www.wwpdb.org/documentation/file-format-content/format32/sect10.html#CONECT
.. _MODEL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#MODEL
.. _ENDMDL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#ENDMDL

"""
# n_atoms = None : dummy keyword argument
# (not used, but Writer() always provides n_atoms as the second argument)
Expand Down Expand Up @@ -668,9 +681,6 @@ def _write_pdb_header(self):
unitary values (cubic box with sides of 1 Å) if unit cell dimensions
are not set.

.. _COMPND: http://www.wwpdb.org/documentation/file-format-content/format33/sect2.html#COMPND
.. _REMARKS: http://www.wwpdb.org/documentation/file-format-content/format33/remarks.html

.. versionchanged: 1.0.0
Fix writing of PDB file without unit cell dimensions (Issue #2679).
If cell dimensions are not found, unitary values (cubic box with
Expand Down Expand Up @@ -981,19 +991,21 @@ def _write_timestep(self, ts, multiframe=False):
:class:`PDBWriter` is in single frame mode and no MODEL_
records are written.

.. _MODEL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#MODEL
.. _ENDMDL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#ENDMDL
.. _NUMMDL: http://www.wwpdb.org/documentation/file-format-content/format32/sect2.html#NUMMDL

.. versionchanged:: 0.7.6
The *multiframe* keyword was added, which completely determines if
MODEL records are written. (Previously, this was decided based on the
underlying trajectory and only if ``len(traj) > 1`` would MODEL records
have been written.)
MODEL_ records are written. (Previously, this was decided based on
the underlying trajectory and only if ``len(traj) > 1`` would
MODEL records have been written.)

.. versionchanged:: 1.0.0
ChainID now comes from the last character of segid, as stated in the documentation.
An indexing issue meant it previously used the first charater (Issue #2224)
ChainID now comes from the last character of segid, as stated in
the documentation. An indexing issue meant it previously used the
first charater (Issue #2224)

.. versionchanged:: 2.0.0
When only record_type attribute is present, instead of using ATOM_
for both ATOM_ and HETATM_, HETATM_ record types are properly written
out (Issue #1753).

"""
atoms = self.obj.atoms
Expand Down Expand Up @@ -1028,6 +1040,7 @@ def get_attr(attrname, default):
occupancies = get_attr('occupancies', 1.0)
tempfactors = get_attr('tempfactors', 0.0)
atomnames = get_attr('names', 'X')
record_types = get_attr('record_types', 'ATOM')

for i, atom in enumerate(atoms):
vals = {}
Expand All @@ -1044,17 +1057,21 @@ def get_attr(attrname, default):
vals['segID'] = segids[i][:4]
vals['element'] = guess_atom_element(atomnames[i].strip())[:2]

# .. _ATOM: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#ATOM
self.pdbfile.write(self.fmt['ATOM'].format(**vals))
# record_type attribute, if exists, can be ATOM or HETATM
try:
self.pdbfile.write(self.fmt[record_types[i]].format(**vals))
except KeyError:
errmsg = (f"Found {record_types[i]} for the record type, but "
f"only allowed types are ATOM or HETATM")
raise ValueError(errmsg) from None

if multiframe:
self.ENDMDL()
self.frames_written += 1

def HEADER(self, trajectory):
"""Write HEADER_ record.

.. _HEADER: http://www.wwpdb.org/documentation/file-format-content/format32/sect2.html#HEADER

.. versionchanged:: 0.20.0
Strip `trajectory.header` since it can be modified by the user and should be
sanitized (Issue #2324)
Expand All @@ -1067,23 +1084,16 @@ def HEADER(self, trajectory):
def TITLE(self, *title):
"""Write TITLE_ record.

.. _TITLE: http://www.wwpdb.org/documentation/file-format-content/format32/sect2.html

"""
line = " ".join(title) # TODO: should do continuation automatically
self.pdbfile.write(self.fmt['TITLE'].format(line))

def REMARK(self, *remarks):
"""Write generic REMARK_ record (without number).
"""Write generic REMARKS_ record (without number).

Each string provided in *remarks* is written as a separate REMARK_
Each string provided in *remarks* is written as a separate REMARKS_
record.

See also `REMARK (update)`_.

.. _REMARK: http://www.wwpdb.org/documentation/file-format-content/format32/remarks1.html
.. _REMARK (update): http://www.wwpdb.org/documentation/file-format-content/format32/remarks2.html

"""
for remark in remarks:
self.pdbfile.write(self.fmt['REMARK'].format(remark))
Expand Down Expand Up @@ -1113,10 +1123,6 @@ def MODEL(self, modelnumber):
standard (i.e., 4 digits). If frame numbers are larger than
9999, they will wrap around, i.e., 9998, 9999, 0, 1, 2, ...


.. _MODEL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#MODEL


.. versionchanged:: 0.19.0
Maximum model number is enforced.

Expand All @@ -1131,8 +1137,6 @@ def END(self):
method right before closing the file it is recommended to *not* call
:meth:`~PDBWriter.END` explicitly.

.. _END: http://www.wwpdb.org/documentation/file-format-content/format32/sect11.html#END

"""
if not self.has_END:
# only write a single END record
Expand All @@ -1142,16 +1146,12 @@ def END(self):
def ENDMDL(self):
"""Write the ENDMDL_ record.

.. _ENDMDL: http://www.wwpdb.org/documentation/file-format-content/format32/sect9.html#ENDMDL

"""
self.pdbfile.write(self.fmt['ENDMDL'])

def CONECT(self, conect):
"""Write CONECT_ record.

.. _CONECT: http://www.wwpdb.org/documentation/file-format-content/format32/sect10.html#CONECT

"""
conect = ["{0:5d}".format(entry + 1) for entry in conect]
conect = "".join(conect)
Expand Down
71 changes: 67 additions & 4 deletions testsuite/MDAnalysisTests/coordinates/test_pdb.py
Original file line number Diff line number Diff line change
Expand Up @@ -20,12 +20,12 @@
# MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics Simulations.
# J. Comput. Chem. 32 (2011), 2319--2327, doi:10.1002/jcc.21787
#
import pytest
from io import StringIO
import os
from io import StringIO

import MDAnalysis as mda
import numpy as np
import pytest
from MDAnalysisTests import make_Universe
from MDAnalysisTests.coordinates.base import _SingleFrameReader
from MDAnalysisTests.coordinates.reference import (RefAdKSmall,
Expand All @@ -36,7 +36,8 @@
INC_PDB, PDB_xlserial, ALIGN, ENT,
PDB_cm, PDB_cm_gz, PDB_cm_bz2,
PDB_mc, PDB_mc_gz, PDB_mc_bz2,
PDB_CRYOEM_BOX, MMTF_NOCRYST)
PDB_CRYOEM_BOX, MMTF_NOCRYST,
PDB_HOLE, mol2_molecule)
from numpy.testing import (assert_equal,
assert_array_almost_equal,
assert_almost_equal)
Expand Down Expand Up @@ -188,6 +189,14 @@ def universe2(self):
def universe3(self):
return mda.Universe(PDB)

@pytest.fixture
def universe4(self):
return mda.Universe(PDB_HOLE)

@pytest.fixture
def universe5(self):
return mda.Universe(mol2_molecule)

@pytest.fixture(params=[
[PDB_CRYOEM_BOX, np.zeros(6)],
[MMTF_NOCRYST, None]
Expand Down Expand Up @@ -221,7 +230,6 @@ def u_no_resids(self):
def u_no_names(self):
return make_Universe(['resids', 'resnames'], trajectory=True)


def test_writer(self, universe, outfile):
"Test writing from a single frame PDB file to a PDB file." ""
universe.atoms.write(outfile)
Expand Down Expand Up @@ -446,6 +454,61 @@ def test_stringio_outofrange(self, universe3):
with mda.coordinates.PDB.PDBWriter(outstring) as writer:
writer.write(u.atoms)

def test_hetatm_written(self, universe4, tmpdir, outfile):
"""
Checks that HETATM record types are written.
"""

u = universe4
u_hetatms = u.select_atoms("resname ETA and record_type HETATM")
assert_equal(len(u_hetatms), 8)

u.atoms.write(outfile)
written = mda.Universe(outfile)
written_atoms = written.select_atoms("resname ETA and "
"record_type HETATM")

assert len(u_hetatms) == len(written_atoms), "mismatched HETATM number"
assert_almost_equal(u_hetatms.atoms.positions, written_atoms.atoms.positions)
mieczyslaw marked this conversation as resolved.
Show resolved Hide resolved

def test_default_atom_record_type_written(self, universe5, tmpdir, outfile):
"""
Checks that ATOM record types are written when there is no
record_type attribute.
"""

u = universe5

expected_msg = ("Found no information for attr: "
"'record_types' Using default value of 'ATOM'")

with pytest.warns(UserWarning, match=expected_msg):
u.atoms.write(outfile)

written = mda.Universe(outfile)
assert len(u.atoms) == len(written.atoms), "mismatched number of atoms"

atms = written.select_atoms("record_type ATOM")
assert len(atms.atoms) == len(u.atoms), "mismatched ATOM number"

hetatms = written.select_atoms("record_type HETATM")
assert len(hetatms.atoms) == 0, "mismatched HETATM number"

def test_abnormal_record_type(self, universe5, tmpdir, outfile):
"""
Checks whether KeyError is raised when record type is
neither ATOM or HETATM.
"""
u = universe5
u.add_TopologyAttr('record_type', ['ABNORM']*len(u.atoms))

expected_msg = ("Found 'ABNORM' for record type, but allowed "
"types are ATOM or HETATM")

with pytest.raises(ValueError):
mieczyslaw marked this conversation as resolved.
Show resolved Hide resolved
with pytest.warns(UserWarning, match=expected_msg):
u.atoms.write(outfile)


class TestMultiPDBReader(object):
@staticmethod
Expand Down