Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consume CoarsenedTargets (added in #12251) in javac for cycle resolution, including cycles created by subtargets. #12231

Merged
merged 2 commits into from
Aug 21, 2021

Conversation

patricklaw
Copy link
Member

@patricklaw patricklaw commented Jun 20, 2021

Consume CoarsenedTargets (added in #12251) in javac for cycle resolution, including cycles created by subtargets.

NOTE: For initial CI and review, this commit comments out the pytest skip line in javac_test.py that otherwise prevents #12293 from occuring. Before merging this PR, I will either uncomment that line again, or have already merged a temporary half-fix for #12293 that at least lets us re-enable some tests (but with caveats).

[ci skip-rust]
[ci skip-build-wheels]

@patricklaw patricklaw added lang-java backend: JVM JVM backend-related issues labels Jun 20, 2021
@patricklaw patricklaw requested a review from Eric-Arellano June 20, 2021 21:11
@stuhood
Copy link
Member

stuhood commented Jun 21, 2021

As mentioned in slack, this is "file-level targets" aka "subtargets" (most heavily discussed in #10455).

Our expectation for compiled languages is that they would coarsen the dependencies as much as necessary to invoke a compiler: i.e., batch all files involved in a cycle into one invoke of the compiler. This would mean a M:N relationship between targets/files and compiler invokes.

Concretely, I think that this might mean finding or implementing an algorithm that will split the graph into strongly connected components, and then invoking the compiler once per component. We already import such a thing on the rust side via petgraph: scc, so if there isn't a lightweight implementation in Python, could expose it via ffi.

@stuhood stuhood requested a review from tdyas June 21, 2021 16:22
@tdyas
Copy link
Contributor

tdyas commented Jun 21, 2021

This would mean a M:N relationship between targets/files and compiler invokes.

Rust and Go support will definitely need to do that since Rust builds each crate at once and Go builds each import path (a/k/a Go package) into the applicable binary artifact.

@patricklaw patricklaw requested review from stuhood and removed request for Eric-Arellano July 25, 2021 21:04
@patricklaw patricklaw changed the title javac: Request ExplicitlyProvidedDependencies to break Subtarget dep cycle Consume CoarsenedTargets (added in #12251) in javac for cycle resolution, including cycles created by subtargets. Jul 25, 2021
Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Thank you.

My "hm" comment is not a blocker, but it's likely that adjusting your recursion slightly will make it unnecessary to manually expand targets.

Comment on lines +74 to 84
# Target coarsening currently doesn't perform dep expansion, which matters for targets
# with multiple sources that expand to individual source subtargets.
# We expand the dependencies explicitly here before coarsening, but ideally this could
# be done somehow during coarsening.
# TODO: Should component dependencies be filtered out here if they were only brought in by component members which were
# filtered out above (due to having no JavaSources to contribute)? If so, that will likely required extending
# the CoarsenedTargets API to include more complete dependency information, or to support such filtering directly.
expanded_direct_deps = await Get(Targets, Addresses(request.component.dependencies))
coarsened_direct_deps = await Get(
CoarsenedTargets, Addresses(t.address for t in expanded_direct_deps)
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. I don't think that this should be necessary.

How I had expected that this would work was that if you received a compilation request for a base target (i.e., one with no sources), then you would coarsen it and recursively request compilation of its dependencies, which would be the expanded/file level targets... which would actually have sources (and so on).

Currently it looks like you're bailing early if a target has no sources, but I would expect the recursion to continue: if you're compiling recursively, some nodes in the recursion not having sources shouldn't short-circuit the recursion.

Copy link
Member

@stuhood stuhood Jul 26, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can see how the graph ends up being complete in the unit tests: the CoarsenedTargets for the base targets (dep, t1, and t2) depend on their own files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I had a fair amount of trouble deciding on an overall mental model here. Consuming a CoarsenedTarget in CompileJavaSourceRequest was where I landed. I also waffled for quite a while on spliting this into two rules: one for dealing with expanding and coarsening targets, and one lower level rule for actually running the compilation against a set of JavaSources fields.

I'm open to changing this, but let me explain my thought process:

if you're compiling recursively, some nodes in the recursion not having sources shouldn't short-circuit the recursion.

The whole crux of the problem with javac is limiting my dependency classpath to only the current compilation unit's direct dependencies. We don't want the full transitive closure on the classpath of the current compilation unit, by design, even if we might indirectly, recursively build it in order to build our direct dependencies. So we're really distinguishing here between our compile-time (direct dep) classpath, and a "runtime" classpath that is likely the closure's classpath. This is why the recursion eagerly terminates when it concludes that the current component doesn't produce any classpath entries: we don't need to build it, which means we don't need its deps (only used to build it), which means we don't recursively compile its deps.

But by direct dependency, we really mean direct source dependency, or possibly some other kind of resource dependency that can somehow contribute to the classpath but isn't yet modeled. This is where things get sticky, because target expansion sort of changes the definition of "direct" dependency.

You can see how the graph ends up being complete in the unit tests: the CoarsenedTargets for the base targets (dep, t1, and t2) depend on their own files.

That test only uses explicit file dependencies; it doesn't cover the case I'm covering here where, e.g. t1 depends on dep instead of dep.txt:dep.

An example:

A.java -> B -- (expands to) --> (B.java and B2.java).

If we use target expansion, we'll automatically get A.java -> B.java; A.java -> C.java; and B.java <--> C.java. So subtarget expansion ends up getting us our natural notion of direct dependencies. Just coarsening B will not do this; we need to do both. See javac_test.py:test_compile_with_transitive_multiple_sources in this PR for a concrete example of this.

Stepping way back: my mental model here was to operate over the nodes of an induced coarsened, expanded graph. Then each rule invocation maps 1:1 with a coarsened, expanded component, and its only job is to make sure that its recursive calls are also over coarsened, expanded components. Expanding gets us the right notion of direct dependencies in the presence of subtarget expansion, while coarsening gets us the right notion of a single compilation unit in the presence of source-target cycles.

I'm pretty confident this needs a more generic model, since we're going to need to eventually worry about other rules that can contribute to the compilation classpath, which means we're probably going to need to request a union product from our dependencies. That's where I was imagining two layers of rules, where compile_java_source is lower level and not expected to think hard about how to expand its dependencies. But without another example of a rule that produces files for the compilation classpath, I worry that attempting to model that is premature. So I opted for the "direct" route here just to get javac working and exercising the existing coarsening/expansion machinery.

Again, I'm happy to change this if you have an idea of how to remodel it, but pay particular attention to the example in javac_test.py:test_compile_with_transitive_multiple_sources to make sure that that case is covered (or why that case is somehow not actually a valid graph to test against).

Comment on lines +83 to +91
def expect_single_expanded_coarsened_target(
rule_runner: RuleRunner, address: Address
) -> CoarsenedTarget:
expanded_target = rule_runner.request(Targets, [Addresses([address])]).expect_single()
coarsened_targets = rule_runner.request(
CoarsenedTargets, [Addresses([expanded_target.address])]
)
assert len(coarsened_targets) == 1
return coarsened_targets[0]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to test the CoarsenedTarget API rather than javac... can probably remove it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't a test, it's a helper for a commonly reused pattern in the actual tests. I have to do this expansion every time because of the interface of compile_java_source. I feel like the assertion is valuable even just for readability, in the same spirit as Targets.expect_single().

[ci skip-rust]

[ci skip-build-wheels]
@patricklaw patricklaw requested a review from stuhood August 21, 2021 21:24
@patricklaw
Copy link
Member Author

I'm going to go ahead and submit since the comments were nonblocking, and I have other PRs that are currently branched off of this one. I'm happy to continue the conversation here and follow up with any changes in another PR.

@patricklaw patricklaw enabled auto-merge (squash) August 21, 2021 21:41
@patricklaw patricklaw merged commit 3df7c00 into pantsbuild:main Aug 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: JVM JVM backend-related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants