Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stop roundtrip snapshotting in rsc compiles #7838

Conversation

cosmicexplorer
Copy link
Contributor

Problem

The RscCompile task currently will drop the output_directory_digest of any hermetic compiles it performs, instead relying on materializing output files to .pants.d, then re-snapshotting them. This is slow, and the slowdown (possibly) scales quadratically with the depth of the dependency graph.

Solution

  • Make ctx.rsc_jar_file (for RscCompileContext) into a ClasspathEntry instead of just a path.
  • Return the output_directory_digest from self._runtool (if performed hermetically, and write it to ctx.rsc_jar_file.

Result

The rsc compile task is probably faster!

@@ -399,14 +383,17 @@ def nonhermetic_digest_classpath():
'-d', rsc_jar_file,
] + target_sources

self._runtool(
directory_digest = self._runtool(
args,
distribution,
tgt=tgt,
input_files=tuple(compile_classpath_rel),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than passing the classpath as input_files, it should now be possible to pass it as classpath_entries and assume digests like zinc does. That will allow for removing the input_files arg post #7837.

@@ -576,6 +563,24 @@ def create_compile_context(self, target, target_workdir):
sources=sources,
))

# TODO: This overrides the same method in jvm_compile.py, and that method should be refactored to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible that rather than overriding this method, you should consider modifying self._set_directory_digests_for_valid_target_classpath_directories instead, to support populating digests for all contexts...?

if cp.directory_digest is not None:
return cp

cp_digest = Digest.load(os.path.dirname(cp.path))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's important to always actually call capture_snapshots, because it's possible that the Store no longer holds the Digest, in which case we need to re-snapshot to capture it.

So recommend to call like:

cp_globs = PathGlobsAndRoot(
  PathGlobs((fast_relpath(cp.path, get_buildroot()),)),
  text_type(get_buildroot(),
  Digest.load(os.path.dirname(cp.path)),
))
cp_snapshot, = self.context._scheduler.capture_snapshots((cp_globs,))
cp_digest = cp_snapshot.directory_digest

Also, the Digest needs to be stored explicitly:

cp_digest.dump(..)

See

snapshots = self.context._scheduler.capture_snapshots(
tuple(
PathGlobsAndRoot(
PathGlobs([jar]),
get_buildroot(),
Digest.load(jar),
) for jar in jar_paths
))
for snapshot, jar_path in zip(snapshots, jar_paths):
snapshot.directory_digest.dump(jar_path)

@illicitonion
Copy link
Contributor

Obsoleted by #7858 and #7861 :) Thanks for starting this off!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants