Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use absolute file paths in ITPParser #3108

Merged
merged 22 commits into from
Apr 23, 2021

Conversation

aditya-kamath
Copy link
Contributor

@aditya-kamath aditya-kamath commented Jan 24, 2021

Fixes #3037

Changes made in this Pull Request

  • ITPParser now accepts relative paths

PR Checklist

  • [] Tests?
  • [] Docs?
  • CHANGELOG updated?
  • Issue raised/referenced?

Copy link
Member

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs a test

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of extra things on top of @richardjgowers' comment

package/CHANGELOG Outdated Show resolved Hide resolved
package/MDAnalysis/topology/ITPParser.py Outdated Show resolved Hide resolved
@IAlibay
Copy link
Member

IAlibay commented Jan 24, 2021

This might need some rethinking, it's breaking CI: https://github.com/MDAnalysis/mdanalysis/pull/3108/checks?check_run_id=1757898726#step:10:999

@aditya-kamath
Copy link
Contributor Author

Hmm, I guess I should understand how these path strings are manipulated in general. This did seem to work when I tried to load a .ITP file from another folder.

@@ -268,7 +268,7 @@ def find_path(self, path):
path = path.name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A slightly different change seems to allow the testsuite to pass while still allowing relative file path handling in my hands:

--- a/package/MDAnalysis/topology/ITPParser.py
+++ b/package/MDAnalysis/topology/ITPParser.py
@@ -265,7 +265,7 @@ class GmxTopIterator:
             current_file = self.current_file
 
         try:
-            path = path.name
+            path = os.path.abspath(path.name)
         except AttributeError:
             pass

I think it addresses Irfan's comment as well

@codecov
Copy link

codecov bot commented Jan 25, 2021

Codecov Report

Merging #3108 (84d0fd4) into develop (8a5ef48) will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff            @@
##           develop    #3108   +/-   ##
========================================
  Coverage    92.81%   92.81%           
========================================
  Files          170      170           
  Lines        22801    22801           
  Branches      3239     3239           
========================================
  Hits         21162    21162           
  Misses        1591     1591           
  Partials        48       48           
Impacted Files Coverage Δ
package/MDAnalysis/topology/ITPParser.py 96.84% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8a5ef48...84d0fd4. Read the comment docs.

@aditya-kamath
Copy link
Contributor Author

@richardjgowers, To add a test, would it be okay to make another folder and show that it can access an ITP file? I'm not able to figure out a better test..

@IAlibay
Copy link
Member

IAlibay commented Mar 10, 2021

@MDAnalysis/gsoc-mentors can we get a follow-up review here?

@aditya-kamath can you fix the merge conflict and also maybe provide some example code for what you mean in #3108 (comment)? (that way it's easier for us to comment on how best to change it).

@aditya-kamath
Copy link
Contributor Author

I guess I'm just asking what kind of test should I provide in this case? I want to show that this parser can now access paths outside root directory. I tried this manually and it works but I'm thinking in a pytest context. :)

@hmacdope
Copy link
Member

In a pytest context you could try just providing any relative path to the ITP you are meant to be testing. This includes the ITPs in the /MDAnalysisTests/data/ folder. That's probably a good place to start.

@pep8speaks
Copy link

pep8speaks commented Mar 11, 2021

Hello @aditya-kamath! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-04-15 21:32:36 UTC

@aditya-kamath
Copy link
Contributor Author

I've added a test, and it seems to work fine when I do pytest test_itp.py. But, this seems to fail? I've also checked that it does indeed work with relative paths.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a test, and it seems to work fine when I do pytest test_itp.py. But, this seems to fail? I've also checked that it does indeed work with relative paths.

See here: https://github.com/MDAnalysis/mdanalysis/pull/3108/checks?check_run_id=2089681447#step:10:366

It doesn't seem like you're creating the temporary ITP file?

class Testrelativepath:

def test_relpath(self):
u = mda.Universe('../data/gromacs_ala10.itp', format='ITP')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this itp file being generated?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I get it. I guess I need to load this file first..!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aditya-kamath Python treats files with relative paths, as relative to whichever directory you've called the test from. Users can call pytest from anywhere in their system, which is why we make all our paths absolute in the datafiles.py file and import them when we want to use our test files. It might be easier for you to write a temporary ITP file using pytest's tools that only exists for the scope of this function or module, and try to read that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lilyminium , This really helps. Now, I understand how to write this test! :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've now created a temporary folder where an itp file is written and have accessed it with the parser. Although, I am still not completely sure if this asserts that itp parser can access relative locations.?

Copy link
Member

@lilyminium lilyminium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this, @aditya-kamath. I just have a few comments :)

@@ -335,3 +335,14 @@ def test_missing_endif(self):
with pytest.raises(IOError):
with self.parser(ITP_no_endif) as p:
top = p.parse(include_dir=GMX_DIR)

class Testrelativepath:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please follow Python convention and use camel casing here if you create a class. However, since you only have one tests, I would suggest that you don't need a class for this test. Instead, you could structure test_relpath as a standalone function.

class Testrelativepath:

def test_relpath(self, tmp_path):
content="[ atoms ]\n" +\
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be neater here to use """ multiline strings. I don't think the indentation will matter to the ITPParser.

d.mkdir()
p = d / "test.itp"
p.write_text(content)
u = mda.Universe(tmp_path / "sub/test.itp", format='ITP')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would still be an absolute path. Instead, I would suggest using the tmpdir fixture, creating a new directory inside that temporary one, and making sure that the relative path you test looks something like "../test.itp".

Copy link
Contributor Author

@aditya-kamath aditya-kamath Mar 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lilyminium, I've been looking at pytest tools and os.paths. I see that there's a command called relative_to in the pathlib module which computes a relative path to a specified directory. My approach would have been to make a tmpdir and set that as 'observer' directory and use that command (relative_to) to find the path that it sees.

As it turns out, the relative_to command only works when computing paths from parent directory reference, not sibling directory. So, I can't compute something like '../dir'. The second thing is even if I manually create a relative string which tmpdir would use to read my temporary itp file, I am not sure how to ask pytest to run from my tmpdir while performing this test.

I might have missed something here, especially because I am new to pytest, but I am happy to hear any advice you have here!

Copy link
Member

@lilyminium lilyminium Mar 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @aditya-kamath,

If you go into your temporary directory, write a file called "test.itp", make a new subdirectory, and change into that directory, you should be able to test that file with the path "../test.itp". You might like the tmpdir.as_cwd context manager for that.

Edit: see this StackOverflow question for examples and notes https://stackoverflow.com/questions/55658866/how-to-get-the-temporary-path-using-pytest-tmpdir-as-cwd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great advice! I figured the tests out. Just a note that I've done the reverse of what you might expect. Instead of testing a relative path and then converting it to a relative string. I've first tested a relative string and then followed it up by converting it to a path. Hope that's alright!

@@ -265,10 +265,9 @@ def find_path(self, path):
current_file = self.current_file

try:
path = path.name
path = os.path.abspath(path.name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This try/except is to check whether path is a string (which triggers the AttributeError) or a Path object from pathlib (so Path.path is the string version). Does your fix still work if a relative string path is passed in instead, triggering the AttributeError?

By the way, your test should check for both, just in case :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lilyminium, that's a good observation! However, if the user isn't providing a string and triggering this AttributeError, it's probably doing the right thing by just passing and eventually raising an IO error. I can write a test to check if an IOerror is raised when a relative string is passed but not as a string.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @aditya-kamath,

if the user isn't providing a string and triggering this AttributeError

If a string is provided and the AttributeError is triggered, does your fix still work? The IOError test is probably not necessary. However, your test should check that your fix applies to both a relative Path (https://docs.python.org/3/library/pathlib.html) and a relative string.

(so Path.path is the string version)

This was a typo, sorry. Path.name returns the base name of the Path, as a string.

Copy link
Contributor Author

@aditya-kamath aditya-kamath Mar 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, the answer to if it will work when there's an Attribute error is no! :(
I'm assuming that the situation when this will happen is when os.path.abspath is not able to convert a relative string to an absolute path. I'll write the two situations in my test to see if there's a case when this might happen! :D

@@ -21,6 +21,7 @@ The rules for this file:
* 2.0.0

Fixes
* ITPParser now accepts relative paths (Issue #3037)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* ITPParser now accepts relative paths (Issue #3037)
* ITPParser now accepts relative paths (Issue #3037, PR #3108)

@lilyminium
Copy link
Member

Thanks for adding the new tests, @aditya-kamath. It's unexpected that they pass, as I expect not accounting for the AttributeError to break the test. On further investigation, the ITPParser could always deal with relative paths when used to create a Universe. I tested this by reverting your os.path.abspath change; the tests still pass. However, what it cannot deal with is relative paths when they're #included in earlier ITPs. That means the test case should actually be a file like this:

#include "../sub1/atoms.itp"
        
[ atoms ]
 1      H      1    SOL    HW1      1       0.41    1.00800

with a corresponding atoms.itp. This means that instead of writing a temporary file, you could actually just include a new ITP file in the data directory and as a variable in datafiles.py that references an upper-level ITP file. It's up to you which one you choose.

@richardjgowers did you first find the bug when one of your ITP files #included an upper level ITP file, or was the relative path included in some other way?

@aditya-kamath
Copy link
Contributor Author

Okay, sounds good! I'll try to get the parser working for all these cases. The next steps may be to emit warnings for broken files...

@IAlibay IAlibay added this to the 2.0 milestone Apr 6, 2021
@aditya-kamath
Copy link
Contributor Author

Btw, @lilyminium , I still believe this fix is working well as is. I tested it to see if it could parse files which #included other files and while it could not do it before, it seems to be able to parse relative paths which '#include other paths. I've written a test for this which creates three temporary folders and while located in one of the folders, tries to parse an itp file in it's sibling directory which #includes a second itp file. I've run pytest fine with this but it's breaking CI so I guess I missed something..

@lilyminium
Copy link
Member

@aditya-kamath your tests still pass even without your changes. It's often good to come up with a falsifying example to start off with. Can you get something like this test to pass?

def test_relative_path(self, tmpdir):
        test_itp_content = '#include "../atoms.itp"'
        atoms_itp_content = """
        [ moleculetype ]
        UNK 3

        [ atoms ]
        1      H      1    SOL    HW1      1       0.41    1.00800
        """

        with tmpdir.as_cwd():
            with open("atoms.itp", "w") as f:
                f.write(atoms_itp_content)
            subdir = tmpdir.mkdir("subdir")
            with subdir.as_cwd():
                with open("test.itp", "w") as f:
                    f.write(test_itp_content)
                subsubdir = subdir.mkdir("subsubdir")
                with subsubdir.as_cwd():
                    u = mda.Universe("../test.itp")
                    assert len(u.atoms) == 1

Having a look at how the function gets the output dir_path may help.

@aditya-kamath
Copy link
Contributor Author

Thanks for the test @lilyminium . It seems to pass on my computer, I've added it to this branch so let's see if it passes the checks here.

@aditya-kamath
Copy link
Contributor Author

All checks passed with the new test, there might yet be a case situation we are missing!

@lilyminium
Copy link
Member

Awesome, thank you @aditya-kamath -- I think this fix covers most cases then! Could you please just address the PEP8 comments by the bot in #3108 (comment) and then I think this is good to go!

Sorry for the delay in replying, got caught up in other things. Please just ping me when you're happy.

@aditya-kamath
Copy link
Contributor Author

aditya-kamath commented Apr 15, 2021

Great! :D
I've fixed most of the PEP8 errors. The last couple of them is because I've used a class to put the test functions in. This is not really necessary, mostly for an organized code structure.
@lilyminium

Copy link
Member

@lilyminium lilyminium left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bot is just asking you to add an extra line / remove some lines in the class :)

aditya-kamath and others added 2 commits April 15, 2021 17:32
Co-authored-by: Lily Wang <[email protected]>
Co-authored-by: Lily Wang <[email protected]>
@aditya-kamath
Copy link
Contributor Author

It's looking good @lilyminium!

@lilyminium
Copy link
Member

lilyminium commented Apr 15, 2021

@richardjgowers @IAlibay could you guys please have another look at this? If not, I'll dismiss the reviews as stale.

@lilyminium lilyminium dismissed stale reviews from IAlibay and richardjgowers April 23, 2021 11:28

Dismissing old review as stale.

@lilyminium lilyminium changed the title Itp parserfix Use absolute file paths in ITPParser Apr 23, 2021
@lilyminium lilyminium merged commit b82635a into MDAnalysis:develop Apr 23, 2021
@lilyminium
Copy link
Member

Thank you @aditya-kamath for your patience on this on this very helpful fix!

lilyminium added a commit to lilyminium/mdanalysis that referenced this pull request Apr 28, 2021
jbarnoud added a commit that referenced this pull request May 3, 2021
* added get_connections

* modified tests for atoms.bonds/angles/dihedrals etc

* modified parsers and things to use get_connections or bonds

* updated CHANGELOG

* pep8

* undo half of PR 3160

* add intra stuff

* Update package/MDAnalysis/core/groups.py

Co-authored-by: Jonathan Barnoud <[email protected]>

* tighten up base class checking

* update docstring

* suppres warnings

* Use absolute file paths in ITPParser (#3108)

Fixes #3037

Co-authored-by: Lily Wang <[email protected]>

* Adds aromaticity and Gasteiger charges guessers (#2926)

Towards #2468 

## Work done in this PR
* Add aromaticity and Gasteiger charges guessers which work via the RDKIT converter.

* BLD: handle gcc on MacOS (#3234)

Fixes #3109 

## Work done in this PR
* gracefully handle the case where `gcc` toolchain
in use on MacOS has been built from source using `clang`
by `spack` (so it really is `gcc` in use, not `clang`)

## Notes
* we could try to add regression testing, but a few problems:
- `using_clang()` is inside `setup.py`, which probably
can't be safely imported because it has unguarded statements/
code blocks that run right away
- testing build issues is typically tricky with mocking, etc.
(though in this case, probably just need to move `using_clang()`
somewhere else and then test it against a variety of compiler
metadata strings

* Remove ParmEd Timestep writing "support" (#3240)

Fixes #3031

* Adding py3.9 to gh actions CI matrix (#3245)

* Fixes #2974
* Python 3.9 officially supported
* Add  Python 3.9 to testing matrix
* Adds macOS CI entry, formalises 3.9 support

* fix changelog

* special metaclass

* move function down

* tidy code

Co-authored-by: Jonathan Barnoud <[email protected]>
Co-authored-by: Aditya Kamath <[email protected]>
Co-authored-by: Cédric Bouysset <[email protected]>
Co-authored-by: Tyler Reddy <[email protected]>
Co-authored-by: Irfan Alibay <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ITPParser doesn't like some relative paths
9 participants