Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port cloc goal to v2 engine #8251

Merged
merged 1 commit into from
Nov 13, 2019
Merged

Conversation

gshuflin
Copy link
Contributor

@gshuflin gshuflin commented Sep 6, 2019

Problem

The cloc goal was using v1 engine abstractions rather than v2 engine abstractions

Solution

Rewrite the goal to use the v2 engine abstractions and delete the v1 engine code. After this commit, the ClocBinary subclass of BinaryToolBase is also removed since nothing is using it anymore.

Result

Nothing about the cloc goal should change from the user's perspective

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! All my suggestions are nits. Thank you for working on this!

Eric-Arellano added a commit that referenced this pull request Sep 7, 2019
…ctor (#8255)

#8251 (comment) showed that `pathlib.Path` can take multiple args in the constructor, like `os.path.join()`, for a more ergonomic API that allows us to avoid using `/` to join two Paths.

This also replaces `os.path` with `pathlib.Path` (and adds f-strings) to some files.
@gshuflin gshuflin marked this pull request as ready for review September 9, 2019 22:21
@gshuflin
Copy link
Contributor Author

gshuflin commented Sep 9, 2019

@stuhood @jsirois

Copy link
Contributor

@jsirois jsirois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only blocker is input_files.txt having non-deterministic content due to the set input being ordered by a graph walk. That graph walk could change over time as dependencies change but still produce the same set of files in a different order. Otherwise lgtm!

Copy link
Contributor

@jsirois jsirois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll try to keep an eye on CI and merge on green.

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! Thank you!

def alias_groups(cls):
return BuildFileAliases(targets={'java_library': JavaLibrary})

def assert_counts(self, result: Sequence[str], lang: str, num_files: int, blank: int, comment: int, code: int) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only an FYI, not gonna make you do this. You've pushed enough changes. One of my favorite features in Python is requiring kwarg only by specifying *:

Suggested change
def assert_counts(self, result: Sequence[str], lang: str, num_files: int, blank: int, comment: int, code: int) -> None:
def assert_counts(self, result: Sequence[str], lang: str, *, num_files: int, blank: int, comment: int, code: int) -> None:

This means that anything after the star must use a keyword argument. You end up using kwargs anyways, but this would mean that the language itself enforces it for you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a neat feature I didn't know about and I'll keep it in mind, but since this is test support code that we don't expect to use outside this test I think it's okay to leave this out.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! One blocker due to sandbox leakage.

sources = getattr(t, 'sources', None)
if sources is not None:
for f in sources.snapshot.files:
source_paths.add(str(Path(build_root, f)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are absolute, which means that the rule will escape its sandbox. Instead, they should be relative, so that the contents of the snapshot actually get used.

transitive = options.values.transitive
ignored = options.values.ignored

if (transitive):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for parens here.


file_outputs = {fc.path: fc.content.decode() for fc in files_content.dependencies}

with CountLinesOfCode.line_oriented(options, console) as (print_stdout, print_stderr):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This goal isn't line oriented I don't think: the mixin adds things that would break the output of cloc (separator character).

I think you want to skip extending LineOriented, and just use the methods of Console directly.



@rule(DownloadedClocScript, [])
def download_cloc_script():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By hardcoding this, folks won't be able to change the version any longer.

I know that this isn't the first place that we've done this, but would you mind creating/referencing a ticket about adding support to the existing Subsystems for specifying per-platform digests for things, so that our existing options (--version, --binary-base-url, etc) can be used?

Copy link
Contributor

@jsirois jsirois left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More to come...

from pants.util.objects import datatype


class DownloadedClocScript(datatype([('script_path', str), ('digest', Digest)])):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: users do not care how the ClocScript is retrieved, just that it's the cloc script. Consider dropping the Dowloaded prefix here. In the rule function name, in contrast, I think the prefix makes perfect sense.

digest = yield Get(Digest, DirectoriesToMerge(directories=tuple(digests_to_merge)))

cmd = (
'/usr/bin/perl',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This deserves a comment calling the hack out. It's a hack we also rely on to execute pexes, although we leverage PATH there in the form of the PythonSetup --interpreter-search-paths option:

pants/pants.remote.ini

Lines 30 to 40 in 455190b

[python-setup]
# TODO(#7735): This config is not ideal, that we must specify the PATH for both local and remote
# platforms. This should be replaced by a proper mechanism to differentiate between the two.
interpreter_search_paths: [
# We include the host PATH and PEXRC values so that speculation still works.
'<PATH>',
'<PEXRC>',
# This is the $PATH of the docker container, obtained by locally running `$ docker run --tag
# rbe-remote-execution sh -c 'echo $PATH'`.
"/pyenv-docker-build/versions/3.7.3/bin:/pyenv-docker-build/versions/3.6.8/bin:/pyenv-docker-build/versions/2.7.15/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/go/bin",
]

Consider filing an issue to track either:

  1. Handling bootstrap of implicit binary requirements like these via BinaryUtil, or the more deep bootstrapping of a c compiler toolchain all the way through building perl all via rules.
  2. Finding a way to express / impose these requirements on users so they get a better / earlier error message that the environment hosting the EPR doesn't have the required binary.


digests_to_merge = []

build_root = get_buildroot()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused (thankfully!).

#TODO(#8459) - It would be nice if we weren't hardcoding a specific version of the cloc binary here.
@rule
def download_cloc_script() -> DownloadedClocScript:
url = "https://binaries.pantsbuild.org/bin/cloc/1.80/cloc"
Copy link
Member

@stuhood stuhood Oct 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, hm. The URL actually needs to be constructed using BinaryUtil.Factory, because although it's fine to hardcode the version temporasrily, we can't really hardcode the URL (users use these tools from behind proxies).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stuhood since this is not trivial to do (and since we already have an open ticket #7790 for this), in the interest of getting this PR merged I've removed the code that gets rid of the v1 version of cloc and renamed the v2 version fast-cloc. Since we're no longer dropping support for the v1 cloc, I think this should be okay to merge, and then fix the hardcoding in a subsequent commit.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

@gshuflin gshuflin force-pushed the new-v2-cloc branch 5 times, most recently from 9236da1 to 0591eb4 Compare November 13, 2019 00:33
#TODO(#8459) - It would be nice if we weren't hardcoding a specific version of the cloc binary here.
@rule
def download_cloc_script() -> DownloadedClocScript:
url = "https://binaries.pantsbuild.org/bin/cloc/1.80/cloc"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep.

Comment on lines +49 to +51
register('--transitive', type=bool, fingerprint=True, default=True,
help='Operate on the transitive dependencies of the specified targets. '
'Unset to operate only on the specified targets.')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cosmicexplorer @stuhood is this okay to still have given #8224?

@gshuflin gshuflin merged commit 58251ef into pantsbuild:master Nov 13, 2019
@gshuflin gshuflin deleted the new-v2-cloc branch November 13, 2019 23:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants