Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ClusteringMethods don't need to return details #2575

Closed
lilyminium opened this issue Mar 2, 2020 · 9 comments · Fixed by #2620
Closed

ClusteringMethods don't need to return details #2575

lilyminium opened this issue Mar 2, 2020 · 9 comments · Fixed by #2620

Comments

@lilyminium
Copy link
Member

lilyminium commented Mar 2, 2020

Expected behavior

Everything returned by ClusteringMethod.__call__ contains information that is used.

Actual behavior

ClusteringMethod.__call__ returns clusters, details. details is always an empty dictionary. It's not used by cluster and is not even in the documentation:

def __call__(self, x):
"""
Parameters
----------
x
either trajectory coordinate data (np.array) or an
encore.utils.TriangularMatrix, encoding the conformational
distance matrix
Returns
-------
numpy.array
list of cluster indices
"""

The documentation is also inaccurate: clusters is an array of the centroid frame indices of length #elements, as returned by encode_centroid_info:

def encode_centroid_info(clusters, cluster_centers_indices):
"""
Adjust cluster indices to include centroid information
as described in documentation for ClusterCollection
"""
values, indices = np.unique(clusters, return_inverse=True)
for c_center in cluster_centers_indices:
if clusters[c_center] != c_center:
values[indices[c_center]] = c_center
return values[indices]

Code to reproduce the behavior

As shown below, cluster does not use the empty details dictionary.

ccs = [ClusterCollection(clusters[1][0],
metadata=metadata) for clusters in results]

Currently version of MDAnalysis

  • Which version are you using? (run python -c "import MDAnalysis as mda; print(mda.__version__)") 0.20.2-dev
  • Which version of Python (python -V)?
  • Which operating system?

Suggested fix
Remove details from all the ClusteringMethods and update the documentation. Run the tests to make sure it doesn't affect anything else.

@shfrz
Copy link
Contributor

shfrz commented Mar 2, 2020

Hi @lilyminium I would want to take this one!

@lilyminium
Copy link
Member Author

@mtiberti are you happy with @shfrz fixing this issue?

@orbeckst
Copy link
Member

orbeckst commented Mar 5, 2020

@shfrz please feel free to work on this issue. @mtiberti can always review.

@shfrz
Copy link
Contributor

shfrz commented Mar 6, 2020

@orbeckst sure, thank you!

@shfrz shfrz mentioned this issue Mar 6, 2020
4 tasks
@shfrz
Copy link
Contributor

shfrz commented Mar 6, 2020

@MDAnalysis/gsoc-mentors could you please review my PR? Thanks!

@shfrz
Copy link
Contributor

shfrz commented Mar 7, 2020

Hi, @lilyminium I implemented the suggested fix but 8 out of 12 tests are failing!

@kpiyush04
Copy link

Hi! @lilyminium @orbeckst
I want to work on this issue

@IAlibay
Copy link
Member

IAlibay commented Mar 8, 2020

@kpiyush04, there is already a PR by @shfrz aiming to deal with this issue (#2585). I would suggest possibly going for a separate GSOC starter issue as #2585 will be prioritised. If the PR stalls then it will be closed and we will look for alternatives.

@kpiyush04
Copy link

ok thanks a lot

IAlibay pushed a commit that referenced this issue Mar 19, 2020
Fixes issue #2575 

## Work done in this PR:

Here the always-empty `details` dictionary which used to be returned by ClusterMethods has been removed.

## Files changed

  - ClusterMethods.py (removed details).
  - test_encore.py (removed details).
  - cluster.py (removed details).
  - AUTHORS and CHANGELOG updated accordingly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants