-
-
Notifications
You must be signed in to change notification settings - Fork 652
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for source roots in V2 ./pants test
#7696
Add support for source roots in V2 ./pants test
#7696
Conversation
./pants test
./pants test
commit 0a38b338ba5c3d9dd0bc080fa477e8b74e9850e6 Merge: c0bfc9e 9ca32a9 Author: Eric Arellano <[email protected]> Date: Fri May 10 11:20:51 2019 -0700 Merge branch 'dwagnerhall/strip-prefix' of https://github.com/twitter/pants into twitter-dwagnerhall/strip-prefix commit c0bfc9e Author: Eric Arellano <[email protected]> Date: Fri May 10 10:49:18 2019 -0700 Output stderr in V2 test rule (pantsbuild#7694) ### Problem Implements pantsbuild#7388. Swallowing stderr makes debugging very difficult. Important clarification: failing tests will write to stdout, meaning that we do already output any message captured by Pytest. Instead, this PR is meant to log issues that occur before Pytest executes, such as issues from Pants. ### Solution Follow the convention established by pantsbuild#7676 to output `{address} stderr:`, followed by the stderr. commit 55e5721 Author: Daniel Wagner-Hall <[email protected]> Date: Fri May 10 17:32:03 2019 +0100 Remove now-unused Path type (pantsbuild#7701) The engine lost Paths from its Snapshots at some point, and we didn't clean up. commit 9ca32a9 Author: Daniel Wagner-Hall <[email protected]> Date: Fri May 10 10:59:41 2019 +0100 Allow Directories to be un-nested commit 5eed2e7 Author: Eric Arellano <[email protected]> Date: Thu May 9 17:38:03 2019 -0700 Mark build-support Python files as Pants targets to lint build-support (pantsbuild#7633) ### Converting to Pants targets Now that we have two Python scripts, and may have more scripts in the future, there is value in linting our script code. By making these two scripts as targets, we can now use `./pants fmt`, `./pants lint`, and `./pants mypy` on the build-support folder for little cost. The pre-commit check will check `./pants fmt` and `./pants lint` automatically on the `build-support` Python files now. It will not do the header check until we explicitly add the folder. ### Add `common.py` We also add `common.py` to reduce duplication and simplify writing scripts. See the `NB` in that file about why we only use the stdlib for that file.
The Pex looks correct now... But it still does not pass!
We never include it in actual BUILD files. This way we test that the __init__.py injection actually works.
./pants test
./pants test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not add any value to init.py, unlike...to keep things simple for now. This might need to be changed down the road.
When might it need to be changed? Could we have an issue link?
This does mean that Pytest's output will be changed
This sounds like it means ./pants test
output is going to be changed for everyone, but I think this just means for the v2 runner right? Maybe I'm picking on words, but since this is in the commit message, I might say something about being at feature parity vs v1 pytest to be more clear.
|
||
|
||
testprojects/tests/python/pants/dummies:target_with_source_dep_relative_import ..... SUCCESS | ||
""") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally prefer keeping parentheses joined here instead of spreading them over a new line, but I think the style is inconsistent all over the repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. For now, this is using the file convention which I'm going to stick with.
This is precisely why I want to add Black to our codebase - either style works just fine, but the issue is that this statement is unfortunately true :/
I think the style is inconsistent all over the repo.
output_files=injected_inits, | ||
description="Inject empty __init__.py into all packages without one already.", | ||
input_files=sources_digest, | ||
env={"PATH": os.environ["PATH"]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would really, really like this to be taken out either into its own v2 rule with e.g. a @rule(FallibleExecuteProcessResult, [LocalPATHExecuteProcessRequest])
which wraps a normal ExecuteProcessRequest
, does some transformation (setting env={"PATH": os.environ["PATH"]}
), then does a yield Get(FallIbleExecuteProcessResult, ExecuteProcessRequest, modified_exe_request)
. I would like this to be done in this PR if possible, as I would prefer to avoid having references to the local PATH start sprouting unchecked (and I would love to be able to modify just a single @rule
in order to modify the behavior of all rules which use anything from the host PATH!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to do this in a followup, but request not blocking on this here. 3 V2 Pytest PRs are blocked by this one landing, which block any progress on the remoting project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not add any value to init.py, unlike...to keep things simple for now. This might need to be changed down the road.
When might it need to be changed? Could we have an issue link?
For now, it does not appear we will ever need to add it. Everything works just fine with empty __init__.py
and the default in Python is to use an empty file. I only mentioned the divergence so that someone in the future knows this could potentially be a source of issues, even though there is no evidence at the moment that it is.
This does mean that Pytest's output will be changed
This sounds like it means ./pants test output is going to be changed for everyone, but I think this just means for the v2 runner right? Maybe I'm picking on words, but since this is in the commit message, I might say something about being at feature parity vs v1 pytest to be more clear.
My bad - reworded to explain that this impact is only for V2.
--
Thanks for the review!
|
||
|
||
testprojects/tests/python/pants/dummies:target_with_source_dep_relative_import ..... SUCCESS | ||
""") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair. For now, this is using the file convention which I'm going to stick with.
This is precisely why I want to add Black to our codebase - either style works just fine, but the issue is that this statement is unfortunately true :/
I think the style is inconsistent all over the repo.
output_files=injected_inits, | ||
description="Inject empty __init__.py into all packages without one already.", | ||
input_files=sources_digest, | ||
env={"PATH": os.environ["PATH"]} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to do this in a followup, but request not blocking on this here. 3 V2 Pytest PRs are blocked by this one landing, which block any progress on the remoting project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good :) Thanks!
Digest, MergedDirectories(directories=tuple(all_sources_digests)), | ||
) | ||
|
||
# TODO(7716): add a builtin rule to go from MergedDirectories->Snapshot or Digest->Snapshot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh! I'm pretty sure we actually can do this now! (There were reasons we couldn't before, but I think they're all gone).
I'd be very happy to talk you through how to do this - it would look very similar to #7699 but instead of calling a new fs::Snapshot::strip_prefix
function it would call the existing fs::Snapshot::from_digest
function.
Definitely doesn't need to block this PR, but would be good to do soon :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yay! I've always wanted to write Rust, so with a little guidance would be happy to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a couple distinct followups I'd like to do from this PR:
- Generalize
__init__.py
creation. Add V2 rule to inject__init__.py
into directory digests #7715 - Add builtin rule
Digest -> Snapshot
and/orMergedDirectories->Snapshot
. Provide in V2 a builtin rule to go from Digest -> Snapshot #7716 - Add builtin rule
FilesContent -> Snapshot
so that we can stop usingtouch
. Add a V2 builtin rule to go from FilesContent -> Snapshot #7718 - Generalize prefix stripping into a rule. Consider how to relativize target Snapshots to source roots #7697
I'd like to do these in a followup if possible, because each is a fairly substantial change and I want them to get dedicated reviews and have the space / time to properly iterate on things like testing each of them. This PR is already pretty substantial in scope + blocks several others, so I would prefer to land as is and do followups.
Digest, MergedDirectories(directories=tuple(all_sources_digests)), | ||
) | ||
|
||
# TODO(7716): add a builtin rule to go from MergedDirectories->Snapshot or Digest->Snapshot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yay! I've always wanted to write Rust, so with a little guidance would be happy to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome :) I'm happy for all the follow-ups to be follow-ups!
So close to killing Py2..
Merging with the understanding that there will be followup work to improve this all (which has started!). |
### Problem Python requires packages to have an `__init__.py` file to be recognized as a proper module for the sake of imports. #7696 does this for Pytest, but inlines the logic, even though it will likely be helpful for other Python rules as well. Further, because this logic was originally written before being able to from `Digest->Snapshot` thanks to #7725, we had to use `FilesContent` to grab the paths of the digest. This would mean that every single source file would be materialized and persisted to memory, resulting in extremely high memory usage (found as part of investigation in #7741). There is no need for the actual content, just the paths, so this is a huge inefficiency. Will close #7715. ### Solution Generalize into `@rule(InitInjectedDigest, [Snapshot])`, where `InitInjectedDigest` is a thin wrapper around a `Digest`. We take a `Snapshot` because we need the file paths to work properly. This contrasts with earlier using `FileContents` to get the same paths. A `Snapshot` is much more light weight. We return a `Digest` because that is the minimum information necessary to work properly, and the caller of the rule can then convert that `Digest` back into a `Snapshot`. ### Result It will now be easier for other Python rules to work with Python packages. The unnecessary memory usage is now fixed. The V2 Pytest runner now has a space complexity of `O(t + t*e + b)`, rather than `O(t + t*e + s)`, where `t` is # targets, `e` is # env vars, `b` is # `BUILD` files, and `s` is # source files.
### Problem While #7696 added support for source roots, its implementation of stripping the actual prefix from the file broke use of resources, specifically when referring to their path in source code, e.g. calling `open('tests/python/pants_test/backend/jvm/tasks/ivy_utils.py')`. This is because we physically stripped the name, so the hardcoded file path would need to be updated to `'pants_test/backend/jvm/tasks/ivy_utils.py'`. Per #8060 (comment), this is not acceptable because it changes a public API. We must have a way to relativize Python files so that their import statements work, but to also keep the full path. ### Solution By populating the `PYTHONPATH` env var for the Pex that runs Pytest with the unique source root paths, Python absolute imports work without actually having to strip any prefix. This appears similar to what [Pycharm and IntelliJ do](https://stackoverflow.com/questions/21199382/why-does-pycharm-always-add-contents-root-to-my-pythonpath-and-what-pythonpat). #### Alternative attempted: use `PEX`'s `--source-directory` PEX is able to understand source roots via its `--source-directory` and `--resource-directory` args. The idea was to create two pexes: a requirements PEX and a sources PEX, then to merge the two via `PEX_PATH`. They would stay separate to increase the likelihood of cache hits. This solution failed, however, due to Pytest failing to discover any tests. See #8063 (comment) and the comment below it. #### Alternative attempted: only strip `PythonTarget`s The first idea was to only strip any subclass of `PythonTarget` and keep the full path for other target types like `files` (2b68ebd). This failed quickly, though, because the `VERSION` file is not able to correctly imported in this case. ### Result Two tests can be removed from the V2 blacklist without any changes to them required.
### Problem #7696 introduced support for source roots to V2 Pytest for the first time, but it did not support loose files. #8063 tried to fix this by no longer stripping the source root from source files. However, this caused two issues: 1) Regression that `repr()` now shows the full path, not the relativized path: #8063 (comment) 2) Namespace packages, such as contrib code, do not work with V2. A namespace package is where multiple folders have the same package name, like `pants`, but may not within the same actual folder. * Contrib unit tests would fail because `PYTHONPATH` had two entries referring to `pants`: `src/python/pants` and `contrib/mypy/src/python/pants`. Python would use whichever entry came first and ignore the other, which does not work. Instead, we want those two to merge into one namespace package. ### Solution Follow the V1 Pytest approach of stripping the source root, like we used to, but only for source files. Loose files (i.e. `files()`) are not stripped so that the file system APIs work as expected. This approach comes from: https://github.com/pantsbuild/pants/blob/d048fd2e8f34adc32fdce36a51a765fbf6067cff/src/python/pants/backend/python/subsystems/pex_build_util.py#L101-L110 ### Result V2 Pytest now supports both source roots and loose files! This allows us to run most contrib unit tests with V2 and unblocks #8113. Because over 99% of unit tests are now remoted, we go back to only one CI shard for unit tests.
Problem
Running tests over our codebase, e.g.
./pants --no-v1 --v2 test tests/python/pants_test/util:strutil
, will fail because it won't be able to resolve the modulepants
.The issue is that we are not properly handling source roots: https://www.pantsbuild.org/setup_repo.html#source-roots.
Solution
DirectoryWithPrefixToStrip
builtin Rust rule from Allow Directories to be un-nested #7699 to strip the sources' prefixes before getting passed to Pytest.SourceRootConfig
subsystem to dynamically determine that file's source root prefix.__init__.py
files for each package.__init__.py
, which is standard to use an empty__init__.py
in Python. However, this contrasts with some V1 code likepants/src/python/pants/backend/python/subsystems/pex_build_util.py
Lines 325 to 330 in 55529b0
Result
./pants --no-v1 --v2 test tests/python/pants_test/util:strutil
now passes, along with the new integration test.This does mean that Pytest's
output
will be changed when using V2 in that it now only outputs the stripped paths. For now, this is okay because the V2 test runner will still print the whole path in the surrounding log and it is not worth the effort to re-inject the full path in Pytest's output.