Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add black support as a v2 rule #8350

Conversation

pierrechevalier83
Copy link
Contributor

@pierrechevalier83 pierrechevalier83 commented Sep 27, 2019

Problem

We want to provide a way for pants to run black.

Solution

Add a v2 rule for "fmt_v2" (temporarily, to workaround conflict with the v1 "fmt" rule).
Add a run_black rule which runs black on each target.
Register Black's config file as a configurable option.
Add a black config that works for us (line length 100 lines which is aligned with our current style,
exclude the file in "compilation_failure" as it is intentionally invalid and trips black up).

Result

It is now possible to run ./pants fmt_v2 :: to format all python files in the repository (or any other target, as expected).

Note that this isn't the full picture, but it is probably good enough to stand as its own PR.
Things we may want to improve in the future:

  • Make it part of ./pants fmt (and skip the python formatting by default to avoid surprising our users)
  • Minimise the calls to Black by batching all the files form all targets being run and calling it only once
    This will yield speed improvement and a less verbose command line output.
  • Add it as part of our pre_commit hook.
  • Format our code with it.

@pierrechevalier83
Copy link
Contributor Author

Note: this is one of the prerequisites for #8334

@pierrechevalier83 pierrechevalier83 force-pushed the pchevalier/pre_black/black_support_v2 branch from fe1dd02 to e9e1eef Compare September 27, 2019 13:32
Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a few integration tests:

  • Multiple incorrectly formatted files, in different directories, but with at least two files in one directory
  • A correctly formatted file
  • No config file
  • A custom config file

I suspect that the latter two could be unit tests, but it may be kind of fiddly to set them up :)

line-length = 100
exclude = '''
/(
| \.git
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a pants perspective, pants already ignores these files when it expands globs; are they here in order to preserve the ability to just run black not via pants? If so, may be worth a comment to that effect

from pants.util.objects import datatype


class PythonFormatable(datatype(['target'])):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe FormattablePythonTarget?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please add a link to #8343 explaining that this is a horrible hack, and why it's there :)

RequirementsPex, RequirementsPexRequest(
output_filename="black.pex",
requirements=tuple(black.get_requirement_specs()),
interpreter_constraints=(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PythonToolBase by default has a flag for this, so we should probably pass it through from the Black instance.

Also, IIRC black has a minimum python version of 3.6, so we probably want Black to override default_interpreter_constraints to specify that minimum.

Copy link
Contributor Author

@pierrechevalier83 pierrechevalier83 Sep 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, IIRC black has a minimum python version of 3.6, so we probably want Black to override default_interpreter_constraints to specify that minimum.

By default, it picks the flag based on the file itself, so I don't think that's necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow... What would stop this pex from trying to run with py27 if that's the default python on your system?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry: misread your comment. I thought you were speaking about Black's option of specifying the version of the file that it's formatting. Addressed now.

entry_point=black.get_entry_point(),
)
)
target = target.target
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe call the parameter wrapped_target, so this line reads target = wrapped_target.target?

# The exclude option from Black only works on recursive invocations,
# so call black with the directories in which the files are present
# and passing the full file names with the include option
dirs = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be a set, rather than a list; multiple files may be in the same dir

(and then you'll want tuple(sorted(dirs)) when you consume them)

class Fmt(Goal):
"""Autoformat source code."""

name = 'fmt_v2'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO linking to #8351


@console_rule
def fmt(console: Console, targets: HydratedTargets) -> Fmt:
results = yield [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO linking to #4535 saying something like "@union assumes that all targets passed implement the union, so we manually filter the targets we know do; this should probably no-op or log or something configurable for non-matching targets"

results = yield [
Get(FmtResult, FmtTarget, target.adaptor)
for target in targets
if isinstance(target.adaptor, (PythonAppAdaptor, PythonTargetAdaptor, PythonTestsAdaptor, PythonBinaryAdaptor)) and hasattr(target.adaptor, "sources")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO linking to #4535 complaining about the hasattr check we need to do here

files_content = yield Get(FilesContent, Digest, result.digest)
for file_content in files_content:
with open(os.path.join(get_buildroot(), file_content.path), "wb") as f:
f.write(file_content.content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a TODO that we should use #8329 (and noting that the reason we're ok with this horribly inefficient implementation is because it will disappear soon :D)

with open(os.path.join(get_buildroot(), file_content.path), "wb") as f:
f.write(file_content.content)

console.print_stdout(result.stdout)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these will print newlines if there's no stdout/stderr, so let's either wrap these in an if result.std*: check, or use console.write_std* to avoid those extra newlines. Or pass end=''.

@pierrechevalier83 pierrechevalier83 force-pushed the pchevalier/pre_black/black_support_v2 branch from e9e1eef to b1f7fc1 Compare September 27, 2019 13:41
@pierrechevalier83
Copy link
Contributor Author

I addressed some of the more minor comments and fixed a bug Daniel spotted. Making integration tests will take me a bit longer, so I'll add it to this PR later.

Copy link
Member

@stuhood stuhood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat! Thanks gang.

Not deeply reviewing right now, but I think that #8329, and maybe some prework for #4535 should be blockers here? I'll add a comment to #4535 with something that we could potentially break out (edit: here).

@illicitonion
Copy link
Contributor

Not deeply reviewing right now, but I think that #8329, and maybe some prework for #4535 should be blockers here? I'll add a comment to #4535 with something that we could potentially break out (edit: here).

The hacks to work around those issues are pretty well localised, and this is a default-off goal (it doesn't shadow any existing goal namespace), so I think we should be good to merge this when it has some tests and do some incremental improvements as the other issues get resolved, but happy to discuss more if you disagree :)

@pierrechevalier83 pierrechevalier83 force-pushed the pchevalier/pre_black/black_support_v2 branch from 152192d to 0561398 Compare September 30, 2019 14:40
@pierrechevalier83
Copy link
Contributor Author

Added some integration tests

Copy link
Contributor

@illicitonion illicitonion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me, modulo remaining comments!

@stuhood How strongly do you want to block on other things?

from os.path import relpath

class PythonFmtIntegrationTest(PantsRunIntegrationTest):
def test_black_no_python_sources_should_noop(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one test should probably be in the core fmt's package, rather than the python one.

from pants.util.contextutil import temporary_file_path
from os.path import relpath

class PythonFmtIntegrationTest(PantsRunIntegrationTest):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add one test which does do formatting, e.g. by creating a temporary file in-line in the test?

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay! +1 on IT for valid formatting change.

# and passing the full file names with the include option
dirs = set()
for filename in target.sources.snapshot.files:
dirs.add(os.path.dirname(filename))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead, use pathlib.Path! It's overall a much nicer API.

from pathlib import Path

dirs.add(Path(filename).parent)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw to go from Path -> str, call str(my_path) or f"{my_path}".

# The exclude option from Black only works on recursive invocations,
# so call black with the directories in which the files are present
# and passing the full file names with the include option
dirs = set()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Empty collections like this should be typed:

from typing import Set

dirs: Set[Path] = set()

(Path is if you take the below recommendation. Otherwise, it's Set[str].

if config_path:
pex_args += ("--config", config_path)
if target.sources.snapshot.files:
pex_args += ("--include", "|".join(re.escape(file) for file in target.sources.snapshot.files))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

file is a builtin. Instead, call it f or fp.

('digest', Digest),
('stdout', str),
('stderr', str),
])):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Conver to dataclass

from dataclasses import dataclass

@dataclass(frozen=True)
class FmtResult:
  digest: Digest
  stdout: str
  stderr: str

Make sure you include 3rdparty/python:dataclasses in BUILD.

# Blocked on: https://github.com/pantsbuild/pants/pull/8329
for file_content in files_content:
with open(os.path.join(get_buildroot(), file_content.path), "wb") as f:
f.write(file_content.content)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from pathlib import Path

Path(get_buildroot(), file_content.path).write_text(file_content.content.decode())

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or

from pathlib import Path

with Path(get_buildroot(), file_content.path).open('wb') as f:
  f.write(file_content.content)

if result.stderr:
console.print_stderr(result.stderr)

# workspace.materialize_directories(tuple(digests))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover?

@Eric-Arellano
Copy link
Contributor

Also when testing your new tests, be sure to run ./pants --no-v1 --v2 test, which is what CI uses by default.

]
merged_input_files = yield Get(
Digest,
DirectoriesToMerge,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use the two-arg form of Get in this case.

with open(os.path.join(get_buildroot(), file_content.path), "wb") as f:
f.write(file_content.content)

if result.stdout:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels likely to be noisy... but I suppose that if a tool knows that it won't produce anything interesting, it could skip including it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that can be made less verbose by batching the files later. I would probably not discard the output as I would want to see what happened as a user.

@pierrechevalier83 pierrechevalier83 force-pushed the pchevalier/pre_black/black_support_v2 branch 4 times, most recently from cfba4f3 to 5a08a00 Compare October 1, 2019 19:39
@pierrechevalier83
Copy link
Contributor Author

https://travis-ci.org/pantsbuild/pants/jobs/592156389
So ./pants reference failed. That's something we may want to do as part of the pre-commit hook in the future to help catch these kind of errors earlier.
Fixed for now in an extra commit.

* Add a fmt goal (named "fmt_v2" for now to avoid conflicts with the v1
  goal)
* Add a run_black rule which formats python code
Black is opinionated, which means that not much config is necessary.

Provide the minimum needed for pants' own taste:

line length of 100, which is consistent with previous practice (see
pre-existing .isort.cfg for a piece of evidence backing this statement)
exclude a few directories such as vitrualenv that really shouldn't be
touched by Black. We also exclude "compilation_failure". This pattern
matches a single file in our repo which trips Black up with its mix of
invalid python syntax and use of a unicode character.

When we run Black later, through pants or manually, it should just do
what we want without needing any CLI arguments.
Use existing example projects as input and assume they're
already formatted because that will be an easy invariant to maintain
once we enforce black formatting of all files.

Testing the actual formatting wouldn't bring a lot of value as we can
rely on black behaving properly in general, and we don't need to couple
these tests too much to black specific things.

These tests prove that black is indeed called on the right files and
with the right arguments in a few contexts, which covers the pants
integration of black.
This needed to be done as part of this commit as some example files were
formatted with Black for the integration tests to behave as they will
once we format the codebase.

Accepting this change means that we will want to format the code sooner
rather than later.
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there! Great work on this :)

# and passing the full file names with the include option
dirs: Set[str] = set()
for filename in target.sources.snapshot.files:
dirs.add(f"{Path(filename).parent}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these changes :)

# Copyright 2019 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

import os
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is unused

# Copyright 2019 Pants project contributors (see CONTRIBUTORS.md).
# Licensed under the Apache License, Version 2.0 (see LICENSE).

import os
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused I think

]
pants_run = self.run_pants(command=command)
code = open(file_name)
formatted = code.read();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead:

formatted = Path(file_name).read_text()

Has the additional benefit of closing the file for you.

@@ -327,6 +328,8 @@ def run_pants_with_workdir_without_waiting(self, command, workdir, config=None,
ini.write(fp)
args.append('--pants-config-files=' + ini_file_name)

Path('pyproject.toml').touch()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, add //:pyproject to the requirements for pants.pex here:

python_binary(
name='pants_local_binary',
entry_point='pants.bin.pants_loader:main',
dependencies=[
':bin',
'//:build_root',
'//:build_tools',
'//:pants_ini',
'//:3rdparty_directory',
'build-support/checkstyle',
'build-support/eslint',
'build-support/ivy',
'build-support/mypy',
'build-support/pylint',
'build-support/regexes',
'build-support/scalafmt',
'build-support/scalastyle',
'contrib:plugins',
'pants-plugins/src/python/internal_backend:plugins',
],

Copy link
Contributor Author

@pierrechevalier83 pierrechevalier83 Oct 2, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had tried this before and it didn't work because the buildroot is not guarenteed to be the same. Fixing to take your suggestion into account and do "the right thing"

source='test_fmt_integration.py',
dependencies=[
'tests/python/pants_test:int-test',
'//:pyproject',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove this because of the above recommendation.

'testprojects/tests/java/org/pantsbuild/testproject:dummies_directory',
],
tags={'integration'},
timeout = 90,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why put this? This one IT probably shouldn't take 90 seconds.

'examples/src/python/example:hello_directory',
],
tags={'integration'},
timeout = 90,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long does this take to run locally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's arbitrary. I thought 90 seconds was an OK default, but it should take less than 10 seconds. Just removed it and kept the default.



class FmtIntegrationTest(PantsRunIntegrationTest):
def test_black_no_python_sources_should_noop(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd generify this name given that it's in the core rules folder. Something like test_noop_when_no_fmt_available_for_target_type

@pierrechevalier83 pierrechevalier83 force-pushed the pchevalier/pre_black/black_support_v2 branch from 31162ad to ddca228 Compare October 2, 2019 15:52
@pierrechevalier83
Copy link
Contributor Author

Addressed the latest comments. I think CI should also be good now (except for spurious issues)

'examples/src/python/example/hello/greet:greet'
]
pants_run = self.run_pants(command=command)
formatted = Path(file_name).read_text();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary semicolon. I think Black will remove this in a bit so probably not worth a CI burn.

@@ -327,6 +328,9 @@ def run_pants_with_workdir_without_waiting(self, command, workdir, config=None,
ini.write(fp)
args.append('--pants-config-files=' + ini_file_name)

if not Path(get_buildroot()).samefile(Path(".")):
shutil.copy(str(Path(get_buildroot(), "pyproject.toml")), str(Path(".")))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm this is surprising and concerning to me. Are you sure it’s necessary? We don’t have to do this for any other tooling config files, and the change you made means that the pants.pex will already have the config in its chrooted folder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sure it's necessary. It tripped me up for a good part of the morning because I had the same expectation you had. That's why I ended up just touching the file, but I still think this latest version is more correct.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if you include it as a dependency on the Pants.pex binary target like you have now AND the IT target? Not sure it will make a difference, but possibly.

I gotta run to class and don’t want to block on this because we can always address in a followup, but would love to make sure there’s no way to avoid this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Found how it works for pants.pex etc. Leads me to a slightly more elegant solution. Will push new commit. (Still a hack, but the same as others)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See my latest commit (hopefully explains what's going on)

@illicitonion illicitonion merged commit 77aaa1e into pantsbuild:master Oct 3, 2019
@pierrechevalier83 pierrechevalier83 deleted the pchevalier/pre_black/black_support_v2 branch October 3, 2019 17:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants