Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for source roots in V2 ./pants test #7696

Merged

Conversation

Eric-Arellano
Copy link
Contributor

@Eric-Arellano Eric-Arellano commented May 9, 2019

Problem

Running tests over our codebase, e.g. ./pants --no-v1 --v2 test tests/python/pants_test/util:strutil, will fail because it won't be able to resolve the module pants.

$ ./pants --no-v1 --v2 test tests/python/pants_test/util:strutil
tests/python/pants_test/util:strutil stdout:
============================= test session starts ==============================
platform darwin -- Python 3.6.8, pytest-3.6.4, py-1.8.0, pluggy-0.7.1
rootdir: /Users/eric/DocsLocal/code/projects/pants/.pants.d/process-execution8hQxGD, inifile:
plugins: cov-2.4.0, timeout-1.2.1
collected 0 items / 1 errors

==================================== ERRORS ====================================
________ ERROR collecting tests/python/pants_test/util/test_strutil.py _________
ImportError while importing test module '/Users/eric/DocsLocal/code/projects/pants/.pants.d/process-execution8hQxGD/tests/python/pants_test/util/test_strutil.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
tests/python/pants_test/util/test_strutil.py:10: in <module>
    from pants.util.strutil import camelcase, ensure_binary, ensure_text, pluralize, strip_prefix
E   ModuleNotFoundError: No module named 'pants'
!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!
=========================== 1 error in 0.14 seconds ============================


tests/python/pants_test/util:strutil                                            .....   FAILURE
Tests failed

The issue is that we are not properly handling source roots: https://www.pantsbuild.org/setup_repo.html#source-roots.

Solution

  • Link to the new DirectoryWithPrefixToStrip builtin Rust rule from Allow Directories to be un-nested #7699 to strip the sources' prefixes before getting passed to Pytest.
    • This means that Pytest never needs to know what a source root is.
    • Uses the SourceRootConfig subsystem to dynamically determine that file's source root prefix.
  • Create empty __init__.py files for each package.
    • For some reason, this was necessary to get the modules recognized, even on Python 3 despite its support of namespace packages.
    • We do not add any value to __init__.py, which is standard to use an empty __init__.py in Python. However, this contrasts with some V1 code like
      if missing_pkg_files:
      with temporary_file() as ns_package:
      ns_package.write(b'__import__("pkg_resources").declare_namespace(__name__)')
      ns_package.flush()
      for missing_pkg_file in missing_pkg_files:
      self._builder.add_source(ns_package.name, missing_pkg_file)
      . To keep things simple for now, we don't add this line—everything is working without it. Down the road, this line may potentially be necessary to add but for now there is no reason.

Result

./pants --no-v1 --v2 test tests/python/pants_test/util:strutil now passes, along with the new integration test.

This does mean that Pytest's output will be changed when using V2 in that it now only outputs the stripped paths. For now, this is okay because the V2 test runner will still print the whole path in the surrounding log and it is not worth the effort to re-inject the full path in Pytest's output.

@Eric-Arellano Eric-Arellano changed the title Add support for absolute imports in V2 ./pants test WIP: Add support for absolute imports in V2 ./pants test May 9, 2019
commit 0a38b338ba5c3d9dd0bc080fa477e8b74e9850e6
Merge: c0bfc9e 9ca32a9
Author: Eric Arellano <[email protected]>
Date:   Fri May 10 11:20:51 2019 -0700

    Merge branch 'dwagnerhall/strip-prefix' of https://github.com/twitter/pants into twitter-dwagnerhall/strip-prefix

commit c0bfc9e
Author: Eric Arellano <[email protected]>
Date:   Fri May 10 10:49:18 2019 -0700

    Output stderr in V2 test rule (pantsbuild#7694)

    ### Problem
    Implements pantsbuild#7388. Swallowing stderr makes debugging very difficult.

    Important clarification: failing tests will write to stdout, meaning that we do already output any message captured by Pytest. Instead, this PR is meant to log issues that occur before Pytest executes, such as issues from Pants.

    ### Solution
    Follow the convention established by pantsbuild#7676 to output `{address} stderr:`, followed by the stderr.

commit 55e5721
Author: Daniel Wagner-Hall <[email protected]>
Date:   Fri May 10 17:32:03 2019 +0100

    Remove now-unused Path type (pantsbuild#7701)

    The engine lost Paths from its Snapshots at some point, and we didn't clean up.

commit 9ca32a9
Author: Daniel Wagner-Hall <[email protected]>
Date:   Fri May 10 10:59:41 2019 +0100

    Allow Directories to be un-nested

commit 5eed2e7
Author: Eric Arellano <[email protected]>
Date:   Thu May 9 17:38:03 2019 -0700

    Mark build-support Python files as Pants targets to lint build-support (pantsbuild#7633)

    ### Converting to Pants targets
    Now that we have two Python scripts, and may have more scripts in the future, there is value in linting our script code.

    By making these two scripts as targets, we can now use `./pants fmt`, `./pants lint`, and `./pants mypy` on the build-support folder for little cost.

    The pre-commit check will check `./pants fmt` and `./pants lint` automatically on the `build-support` Python files now. It will not do the header check until we explicitly add the folder.

    ### Add `common.py`
    We also add `common.py` to reduce duplication and simplify writing scripts.

    See the `NB` in that file about why we only use the stdlib for that file.
The Pex looks correct now... But it still does not pass!
We never include it in actual BUILD files. This way we test that the __init__.py injection actually works.
@Eric-Arellano Eric-Arellano changed the title WIP: Add support for absolute imports in V2 ./pants test Add support for source roots in V2 ./pants test May 14, 2019
@Eric-Arellano Eric-Arellano marked this pull request as ready for review May 14, 2019 00:33
Copy link
Contributor

@cosmicexplorer cosmicexplorer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not add any value to init.py, unlike...to keep things simple for now. This might need to be changed down the road.

When might it need to be changed? Could we have an issue link?

This does mean that Pytest's output will be changed

This sounds like it means ./pants test output is going to be changed for everyone, but I think this just means for the v2 runner right? Maybe I'm picking on words, but since this is in the commit message, I might say something about being at feature parity vs v1 pytest to be more clear.



testprojects/tests/python/pants/dummies:target_with_source_dep_relative_import ..... SUCCESS
""")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer keeping parentheses joined here instead of spreading them over a new line, but I think the style is inconsistent all over the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. For now, this is using the file convention which I'm going to stick with.

This is precisely why I want to add Black to our codebase - either style works just fine, but the issue is that this statement is unfortunately true :/

I think the style is inconsistent all over the repo.

output_files=injected_inits,
description="Inject empty __init__.py into all packages without one already.",
input_files=sources_digest,
env={"PATH": os.environ["PATH"]}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really, really like this to be taken out either into its own v2 rule with e.g. a @rule(FallibleExecuteProcessResult, [LocalPATHExecuteProcessRequest]) which wraps a normal ExecuteProcessRequest, does some transformation (setting env={"PATH": os.environ["PATH"]}), then does a yield Get(FallIbleExecuteProcessResult, ExecuteProcessRequest, modified_exe_request). I would like this to be done in this PR if possible, as I would prefer to avoid having references to the local PATH start sprouting unchecked (and I would love to be able to modify just a single @rule in order to modify the behavior of all rules which use anything from the host PATH!).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to do this in a followup, but request not blocking on this here. 3 V2 Pytest PRs are blocked by this one landing, which block any progress on the remoting project.

Copy link
Contributor Author

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not add any value to init.py, unlike...to keep things simple for now. This might need to be changed down the road.

When might it need to be changed? Could we have an issue link?

For now, it does not appear we will ever need to add it. Everything works just fine with empty __init__.py and the default in Python is to use an empty file. I only mentioned the divergence so that someone in the future knows this could potentially be a source of issues, even though there is no evidence at the moment that it is.

This does mean that Pytest's output will be changed

This sounds like it means ./pants test output is going to be changed for everyone, but I think this just means for the v2 runner right? Maybe I'm picking on words, but since this is in the commit message, I might say something about being at feature parity vs v1 pytest to be more clear.

My bad - reworded to explain that this impact is only for V2.

--

Thanks for the review!



testprojects/tests/python/pants/dummies:target_with_source_dep_relative_import ..... SUCCESS
""")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. For now, this is using the file convention which I'm going to stick with.

This is precisely why I want to add Black to our codebase - either style works just fine, but the issue is that this statement is unfortunately true :/

I think the style is inconsistent all over the repo.

output_files=injected_inits,
description="Inject empty __init__.py into all packages without one already.",
input_files=sources_digest,
env={"PATH": os.environ["PATH"]}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy to do this in a followup, but request not blocking on this here. 3 V2 Pytest PRs are blocked by this one landing, which block any progress on the remoting project.

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good :) Thanks!

Digest, MergedDirectories(directories=tuple(all_sources_digests)),
)

# TODO(7716): add a builtin rule to go from MergedDirectories->Snapshot or Digest->Snapshot.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ooh! I'm pretty sure we actually can do this now! (There were reasons we couldn't before, but I think they're all gone).

I'd be very happy to talk you through how to do this - it would look very similar to #7699 but instead of calling a new fs::Snapshot::strip_prefix function it would call the existing fs::Snapshot::from_digest function.

Definitely doesn't need to block this PR, but would be good to do soon :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yay! I've always wanted to write Rust, so with a little guidance would be happy to do this.

Copy link
Contributor Author

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple distinct followups I'd like to do from this PR:

  1. Generalize __init__.py creation. Add V2 rule to inject __init__.py into directory digests #7715
  2. Add builtin rule Digest -> Snapshot and/or MergedDirectories->Snapshot. Provide in V2 a builtin rule to go from Digest -> Snapshot #7716
  3. Add builtin rule FilesContent -> Snapshot so that we can stop using touch. Add a V2 builtin rule to go from FilesContent -> Snapshot #7718
  4. Generalize prefix stripping into a rule. Consider how to relativize target Snapshots to source roots #7697

I'd like to do these in a followup if possible, because each is a fairly substantial change and I want them to get dedicated reviews and have the space / time to properly iterate on things like testing each of them. This PR is already pretty substantial in scope + blocks several others, so I would prefer to land as is and do followups.

Digest, MergedDirectories(directories=tuple(all_sources_digests)),
)

# TODO(7716): add a builtin rule to go from MergedDirectories->Snapshot or Digest->Snapshot.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yay! I've always wanted to write Rust, so with a little guidance would be happy to do this.

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome :) I'm happy for all the follow-ups to be follow-ups!

So close to killing Py2..
@Eric-Arellano
Copy link
Contributor Author

Merging with the understanding that there will be followup work to improve this all (which has started!).

@Eric-Arellano Eric-Arellano merged commit 12a5d7b into pantsbuild:master May 14, 2019
@Eric-Arellano Eric-Arellano deleted the v2-pytest-absolute-imports branch May 14, 2019 18:17
Eric-Arellano added a commit that referenced this pull request May 16, 2019
### Problem
Python requires packages to have an `__init__.py` file to be recognized as a proper module for the sake of imports.

#7696 does this for Pytest, but inlines the logic, even though it will likely be helpful for other Python rules as well.

Further, because this logic was originally written before being able to from `Digest->Snapshot` thanks to #7725, we had to use `FilesContent` to grab the paths of the digest. This would mean that every single source file would be materialized and persisted to memory, resulting in extremely high memory usage (found as part of investigation in #7741). There is no need for the actual content, just the paths, so this is a huge inefficiency.

Will close #7715.

### Solution
Generalize into `@rule(InitInjectedDigest, [Snapshot])`, where `InitInjectedDigest` is a thin wrapper around a `Digest`.

We take a `Snapshot` because we need the file paths to work properly. This contrasts with earlier using `FileContents` to get the same paths. A `Snapshot` is much more light weight.

We return a `Digest` because that is the minimum information necessary to work properly, and the caller of the rule can then convert that `Digest` back into a `Snapshot`.

### Result
It will now be easier for other Python rules to work with Python packages.

The unnecessary memory usage is now fixed. The V2 Pytest runner now has a space complexity of `O(t + t*e + b)`, rather than `O(t + t*e + s)`, where `t` is # targets, `e` is # env vars, `b` is # `BUILD` files, and `s` is # source files.
stuhood pushed a commit that referenced this pull request Jul 24, 2019
### Problem
While #7696 added support for source roots, its implementation of stripping the actual prefix from the file broke use of resources, specifically when referring to their path in source code, e.g. calling `open('tests/python/pants_test/backend/jvm/tasks/ivy_utils.py')`. This is because we physically stripped the name, so the hardcoded file path would need to be updated to `'pants_test/backend/jvm/tasks/ivy_utils.py'`. Per #8060 (comment), this is not acceptable because it changes a public API.

We must have a way to relativize Python files so that their import statements work, but to also keep the full path.

### Solution
By populating the `PYTHONPATH` env var for the Pex that runs Pytest with the unique source root paths, Python absolute imports work without actually having to strip any prefix. This appears similar to what [Pycharm and IntelliJ do](https://stackoverflow.com/questions/21199382/why-does-pycharm-always-add-contents-root-to-my-pythonpath-and-what-pythonpat).

#### Alternative attempted: use `PEX`'s `--source-directory`
PEX is able to understand source roots via its `--source-directory` and `--resource-directory` args. The idea was to create two pexes: a requirements PEX and a sources PEX, then to merge the two via `PEX_PATH`. They would stay separate to increase the likelihood of cache hits.

This solution failed, however, due to Pytest failing to discover any tests. See #8063 (comment) and the comment below it.

#### Alternative attempted: only strip `PythonTarget`s
The first idea was to only strip any subclass of `PythonTarget` and keep the full path for other target types like `files` (2b68ebd). This failed quickly, though, because the `VERSION` file is not able to correctly imported in this case.

### Result
Two tests can be removed from the V2 blacklist without any changes to them required.
Eric-Arellano added a commit that referenced this pull request Aug 25, 2019
### Problem
#7696 introduced support for source roots to V2 Pytest for the first time, but it did not support loose files. #8063 tried to fix this by no longer stripping the source root from source files. However, this caused two issues:

1) Regression that `repr()` now shows the full path, not the relativized path: #8063 (comment)
2) Namespace packages, such as contrib code, do not work with V2. A namespace package is where multiple folders have the same package name, like `pants`, but may not within the same actual folder.
   * Contrib unit tests would fail because `PYTHONPATH` had two entries referring to `pants`: `src/python/pants` and `contrib/mypy/src/python/pants`. Python would use whichever entry came first and ignore the other, which does not work. Instead, we want those two to merge into one namespace package.
 
### Solution
Follow the V1 Pytest approach of stripping the source root, like we used to, but only for source files. Loose files (i.e. `files()`) are not stripped so that the file system APIs work as expected. This approach comes from:

https://github.com/pantsbuild/pants/blob/d048fd2e8f34adc32fdce36a51a765fbf6067cff/src/python/pants/backend/python/subsystems/pex_build_util.py#L101-L110

### Result
V2 Pytest now supports both source roots and loose files! This allows us to run most contrib unit tests with V2 and unblocks #8113.

Because over 99% of unit tests are now remoted, we go back to only one CI shard for unit tests.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants