Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define Unix shell toolchain #1117

Merged
merged 5 commits into from
Oct 15, 2019
Merged

Define Unix shell toolchain #1117

merged 5 commits into from
Oct 15, 2019

Conversation

aherrmann
Copy link
Member

As became apparent in #1097 Cabal requires standard Unix shell tools in $PATH. Packages with ./configure scripts even more so. So far these standard tools were brought into scope by setting use_default_shell_env=True. However, as pointed out in #1096 this can cause frequent cache invalidation.

This PR adds a new unix_toolchain which captures standard Unix shell tools as defined by IEEE Std 1003.1-2008. It comes with two configuration mechanisms.

  • unix_configure: Which just discovers the available shell tools in PATH in the loading phase. Note, this suffers the same issue as described in stack_snapshot get invalidated too often #1096. This is called automatically with haskell_register_ghc_bindists.
  • unix_nixpkgs: Which obtains the required tools from nixpkgs. This is called automatically with haskell_register_ghc_nixpkgs.

unix_toolchain assumes that the Unix tools are globally installed (Nix store, or $PATH).

unix_toolchain is used to amend $PATH in the Cabal wrapper. To prevent ./configure scripts and Cabal builds from using the wrong C compiler, we point Cabal to cc.tools.cc. (This currently doesn't work on Windows, see comment in code)

@regnat @Profpatsch If this works then you could rebase #1097 on this PR and use the unix_toolchain in the new Python Cabal wrapper.

Copy link
Contributor

@guibou guibou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any obvious error, but that's a huge change and my skills here are rusted.

Is this needed only because of nix? How is this done in other projects which are not dependent on nix but are all dependent on an unix toolchain?

I do agree with the API and the usefulness of this approach. Actually, I think you must alsa open an issue in bazel issue tracker to propose this API as well. Most of the hack in the bazel derivation for nixpkgs are linked with this and bazel may dramatically benefit from a way to specify this tools instead on relying on the default and non-hermetic current bazel behavior.

haskell/cabal.bzl Show resolved Hide resolved
Copy link
Contributor

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A big 💯 on the idea.

I wonder though whether rules_haskell is the right place to put it. Or rather: It's nice to have it here for the time being, but this shouldn't stay here (it isn't haskell specific at all, it just happens that it's a generic bit of thing that's missing for bazel and that rules_haskell requires), so maybe we should move it to its own repo to avoid having to migrate later (which will be a bit painful client-side since the repository names appear everywhere)

@mboes any opinion on that point ^ ?

haskell/private/unix/unix_nixpkgs.bzl Show resolved Hide resolved
haskell/cabal.bzl Show resolved Hide resolved
haskell/nixpkgs.bzl Show resolved Hide resolved
haskell/private/unix/unix_nixpkgs.bzl Show resolved Hide resolved
haskell/private/unix/unix_toolchain.bzl Show resolved Hide resolved
@thufschmitt
Copy link
Contributor

maybe we should move it to its own repo

Or if it makes things easier, we could have it be a different workspace in the rules_haskell tree − but I'm not sure that bazel's handling would make this easier

@aherrmann
Copy link
Member Author

aherrmann commented Oct 8, 2019

@guibou

Actually, I think you must alsa open an issue in bazel issue tracker to propose this API as well.

There is an open issue about this on the Bazel tracker and a related discussion on the mailing list.

Is this needed only because of nix? How is this done in other projects which are not dependent on nix but are all dependent on an unix toolchain?

To my knowledge there is no good solution to this problem right now, the issue above confirms that. Most projects simply assume that such tools are in PATH.

@regnat

wonder though whether rules_haskell is the right place to put it. [...]

I agree this would better fit into a standalone repository in the long run. I started it here since it's needed in the context of rules_haskell for now, and I think it's easier to iterate on this while it's in the same repository until we know that it meets our needs.

@mboes
Copy link
Member

mboes commented Oct 8, 2019

I have not reviewed this PR yet. I share @regnat's concern about how this should toolchain should be shared among all rule sets. IIRC there is a ticket about defining such a toolchain somewhere in the opened by @nlopezgi. I'd even suggest coordinating with upstream about what would be best to do here (how about a question on bazel-discuss@ ?).

The answer will likely be, go ahead with a private unix toolchain for now. In which case, it's okay to keep it in this repo for now.

@aherrmann
Copy link
Member Author

@Profpatsch CI fails with the following

Failures:
--
  |  
  | tests/RunTests.hs:70:7:
  | 1) failures //tests/failures/transitive-deps:lib-dFailure
  | uncaught exception: IOException of type InvalidArgument
  | fd:15: hGetContents: invalid argument (invalid byte sequence)
  |  
  | To rerun use: --match "/failures///tests/failures/transitive-deps:lib-dFailure/"
  |  
  | tests/RunTests.hs:70:7:
  | 2) failures //tests/failures/transitive-deps:lib-cFailure
  | uncaught exception: IOException of type InvalidArgument
  | fd:15: hGetContents: invalid argument (invalid byte sequence)
  |  
  | To rerun use: --match "/failures///tests/failures/transitive-deps:lib-cFailure/"

That failure already exists on master since #1115 see CI runs #424 and #426. So, it seems to be unrelated to this PR.

Copy link
Contributor

@thufschmitt thufschmitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mboes
Copy link
Member

mboes commented Oct 14, 2019

Looping in @laszlocsomor as well, the author of bazelbuild/bazel#5265.

@laszlocsomor
Copy link

Thanks for looping me in! IIUC the toolchain has an attribute for every Unix tool's absolute path (or None). This allows calling those tools as Bazel actions. Neat!

I wonder if we could use this toolchain in Bazel, maybe move it into Skylib. The toolchain would supersede ctx.actions.run_shell in some cases. Has anyone an idea how we could use it for genrules?

@aherrmann
Copy link
Member Author

Has anyone an idea how we could use it for genrules?

Toolchains can expose custom make variables. The toolchain could set one for each existing tool, e.g. $(UNIX_GREP).

@mboes
Copy link
Member

mboes commented Oct 15, 2019

@laszlocsomor it would be mucho good to move this upstream somewhere. As pointed out upthread in this discussion, the extra code introduced in this PR is basically all to address a very generic concern in Bazel, namely that run_shell with use_default_shell_env = True is non-hermetic but a necessary evil in very many cases in Bazel today.

Skylib looks like a good place to me, though I hope that the existence of this toolchain there won't be hard to discover for new rule developers. @aherrmann shall we merge this but immediately thereafter submit upstream there? @laszlocsomor let us know if on second thought there are better places for this.

@aherrmann aherrmann added the merge-queue merge on green CI label Oct 15, 2019
Defines a toolchain that captures standard Unix command line tools. For
now these tools are assumed to be external to Bazel, i.e. declared by
absolute paths. Shell tools can either be imported from nixpkgs using
`unix_nixpkgs`, or from the environment using `unix_configure`.

The GHC toolchain setup functions automatically call the corresponding
Unix toolchain setup functions.
Required if `<nixpkgs>` as defined by `repository` imports
additional files.
@aherrmann
Copy link
Member Author

@aherrmann shall we merge this but immediately thereafter submit upstream there?

Sounds good.

@mergify mergify bot merged commit 5c8066d into master Oct 15, 2019
@mergify mergify bot deleted the unix_toolchain branch October 15, 2019 08:12
@mergify mergify bot removed the merge-queue merge on green CI label Oct 15, 2019
@aherrmann
Copy link
Member Author

bazel-skylib PR: bazelbuild/bazel-skylib#208

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants