-
-
Notifications
You must be signed in to change notification settings - Fork 644
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[internal] Improve Go performance by running fewer processes to analyze external packages #13128
[internal] Improve Go performance by running fewer processes to analyze external packages #13128
Conversation
f81417f
to
0c7ec11
Compare
…fewer processes # Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
0c7ec11
to
1583e7e
Compare
Does this rely on |
No, even if you don't use pantsd the engine will memoize for the current run. See the benchmarks I just posted without Pantsd :D |
# Rust tests and lints will be skipped. Delete if not intended. [ci skip-rust] # Building wheels and fs_util will be skipped. Delete if not intended. [ci skip-build-wheels]
What about subsequent runs? Would this get cached remotely in a CAS? |
The This is the same with target generation. The underlying |
import_path_to_info = {} | ||
for metadata in ijson.items(json_result.stdout, "", multiple_values=True): | ||
import_path = metadata["ImportPath"] | ||
pkg_info = ExternalPkgInfo( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ExternalPkgInfo
is going to need HFiles
for assembly which I did not implement including in the assembly step. It would be prudent to also check if any of the other source keys are set (e.g., CgoFiles
) and error if they are, so we don't try an incomplete build of an external package. This can be addressd as a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I wanted to leave off anything that we don't need until we actually consume it: keep things as simple as possible for now.
I'm fine with eagerly erroring on unsupported files though.
We were running
go list -json
much more frequently than necessary. We run it once per module to generate_go_external_package
targets, but then we would rerun it on each package within the module. Those each often took 5-9 seconds, so this was atrocious for performance. Instead, we can simply run it once per module and leverage the engine's memoization to avoid re-running.This refactors our external module code so that it no longer interacts at all with the Target API, which reduces coupling. We rename things like the file to
external_pkg.py
so that only theExternalModuleInfo
andExternalPkgInfo
types are exposed. Notably, we no longer useResolvedGoPackage
for external code, which allows us to ignore lots of irrelevant fields liketest_imports
.dependencies
benchmarkThis requires changing this line to use
Targets
so that generated targets have their dependencies resolved too:pants/src/python/pants/backend/project_info/dependencies.py
Line 94 in a744978
Before (I aborted after 1 run because it was so slow):
After:
package
benchmarkCompilation is faster now too, as we no longer need to run
go list
to determine the.go
and.s
files. It's already eagerly computed beforehand.Before:
After:
[ci skip-rust]
[ci skip-build-wheels]