Recursive Nix support #3205

Merged
merged 5 commits into from
Dec 2, 2019

edolstra
Member

@edolstra edolstra commented Nov 4, 2019

This allows Nix builders to call Nix to build derivations, with some limitations.

Example:

let nixpkgs = fetchTarball channel:nixos-18.03; in
    
with import <nixpkgs> {};
    
runCommand "foo"
  {
    buildInputs = [ nix jq ];
    NIX_PATH = "nixpkgs=${nixpkgs}";
  }
  ''
    hello=$(nix-build -E '(import <nixpkgs> {}).hello.overrideDerivation (args: { name = "hello-3.5"; })')
    
    $hello/bin/hello
    
    mkdir -p $out/bin
    ln -s $hello/bin/hello $out/bin/hello
    
    nix path-info -r --json $hello | jq .
  ''

This derivation makes a recursive Nix call to build GNU Hello and symlinks it from its $out, i.e.

# ll ./result/bin/
lrwxrwxrwx 1 root root 63 Jan  1  1970 hello -> /nix/store/s0awxrs71gickhaqdwxl506hzccb30y5-hello-3.5/bin/hello
    
# nix-store -qR ./result
/nix/store/hwwqshlmazzjzj7yhrkyjydxamvvkfd3-glibc-2.26-131
/nix/store/s0awxrs71gickhaqdwxl506hzccb30y5-hello-3.5
/nix/store/sgmvvyw8vhfqdqb619bxkcpfn9lvd8ss-foo

This is implemented as follows:

  • Before running the outer builder, Nix creates a Unix domain socket .nix-socket in the builder's temporary directory and sets $NIX_REMOTE to point to it. It starts a thread to process connections to this socket. (Thus you don't need to have nix-daemon running.)

  • The daemon thread uses a wrapper store (RestrictedStore) to keep track of paths added through recursive Nix calls, to implement some restrictions (see below), and to do some censorship (e.g. for purity, queryPathInfo() won't return impure information such as signatures and timestamps).

  • After the build finishes, the output paths are scanned for references to the paths added through recursive Nix calls (in addition to the inputs closure). Thus, in the example above, $out has a reference to $hello.
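
To observe the first step from inside a build, a derivation can print the $NIX_REMOTE value that Nix hands to the builder. This is only a minimal sketch: the derivation name is made up, and requiredSystemFeatures is included because later messages in this thread report that it is needed for the socket to appear.

```nix
with import <nixpkgs> {};

# Hypothetical diagnostic derivation; the name "show-nix-remote" is illustrative.
runCommand "show-nix-remote"
  {
    # Per later messages in this thread, this makes the daemon socket appear.
    requiredSystemFeatures = [ "recursive-nix" ];
  }
  ''
    # Written to the build log. Expected to name the .nix-socket file in the
    # builder's temporary directory when recursive Nix is active.
    echo "NIX_REMOTE=$NIX_REMOTE"

    mkdir -p $out
  ''
```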

The main restriction on recursive Nix calls is that they cannot do arbitrary substitutions. For example, doing

nix-store -r /nix/store/kmwd1hq55akdb9sc7l3finr175dajlby-hello-2.10

is forbidden unless /nix/store/kmwd... is in the inputs closure or previously built by a recursive Nix call. This is to prevent irreproducible derivations that have hidden dependencies on substituters or the current store contents. Building a derivation is fine, however, and Nix will use substitutes if available. In other words, the builder has to present proof that it knows how to build a desired store path from scratch by constructing a derivation graph for that path.
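
The rule can be illustrated with a small derivation. This is a sketch under assumptions, not a tested case: the derivation name is hypothetical, and requiredSystemFeatures is included because later messages in this thread say it is needed. Interpolating the hello package into the script makes its store path a build input, so realising it should be permitted, while the literal path from the text above should be refused.

```nix
with import <nixpkgs> {};

# Hypothetical derivation illustrating the substitution rule.
runCommand "realise-input"
  {
    buildInputs = [ nix ];
    requiredSystemFeatures = [ "recursive-nix" ];
  }
  # The interpolation of the hello package below makes its store path a
  # build input, so it is part of the inputs closure.
  ''
    # Expected to be permitted: the path is in the inputs closure.
    nix-store -r ${hello}

    # Expected to be refused: a literal path that is neither an input nor
    # the result of an earlier recursive build.
    # nix-store -r /nix/store/kmwd1hq55akdb9sc7l3finr175dajlby-hello-2.10

    mkdir -p $out
  ''
```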

Probably we should also disallow instantiating/building fixed-output derivations (specifically, those that access the network, but currently we have no way to mark fixed-output derivations that don't access the network). Otherwise sandboxed derivations can bypass sandbox restrictions and access the network.

When sandboxing is enabled, we make paths appear in the sandbox of the builder by entering the mount namespace of the builder and bind-mounting each path. This is tricky because we do a pivot_root() in the builder to change the root directory of its mount namespace, and thus the host /nix/store is not visible in the mount namespace of the builder. To get around this, just before doing pivot_root(), we branch a second mount namespace that shares its /nix/store mountpoint with the parent.

Recursive Nix currently doesn't work on macOS in sandboxed mode (because we can't change the sandbox policy of a running build) and on Linux in non-root mode (because setns() barfs).

This PR also adds some ccache-like functionality to Nix's makefiles that wraps GCC calls in Nix derivations to enable caching and remote builds. This requires recursive Nix when you want to do this inside a Nix build.

Implements #13.

@zimbatm
Member

zimbatm commented Nov 4, 2019

Assuming that the inner build is quite large, how would the build be distributed to the same set of remote builders as the outer build?

One thing I was wondering is whether it would make sense for the inner build to just return a drv file in $out instead, and then let the outer scheduler update its build plan accordingly. This would be a bit closer to IFD, but with the evaluation happening in a builder instead of entirely in the client.

@Ericson2314
Member

Ericson2314 commented Nov 4, 2019

@edolstra I am worried about exposing nix-build-in-nix-build before we do nix-instantiate-in-nix-build (my RFC). Your commit messages do mention putting it behind a feature flag, and if we ban it in Nixpkgs initially that alleviates my worries. But, let me lay out those worries.

In short, while I think nix-build-in-nix-build is a decent last resort to retrofit existing shoddy stuff, it's never the way anything should strive to work. nix-build-in-nix-build is worse because:

  • Timing is observable: derivations can observe how long nested nix-builds take, and therefore whether the thing they want to build was built before. This weakens our purity/caching. Even if we aren't building adversarial stuff that tries to exploit this, it could still cause painful, hard-to-reproduce bugs.

  • "By default", it's sequential: you have to wait for the last nix-build to finish in your script, or waste effort writing your own parallelization code. With "ret cont" you don't need to put in effort to get better scheduling.

  • Resource usage: Derivations that idle waiting for nested nix-build waste space. I call nix-instantiate-in-nix-build "ret cont" because compared to this you are morally serializing your continuation in the "successor" derivation. That successor derivation will almost surely be a lot smaller.

I don't want to be in a position where people write a bunch of stuff that uses nix-build-in-nix-build because of this PR, and then no one has the energy to rewrite it for nix-instantiate-in-nix-build. Even worse than the code is everybody getting excited about this, and then learning a bunch of other stuff---I'll admit ret-cont is weirder up front and more of an "unlearning" step. I'd like to avoid that tech debt and culture whiplash.

@edolstra
Member Author

edolstra commented Nov 5, 2019

I've made an initial version of a nix-ccache flake: https://github.com/edolstra/nix-ccache. It provides a wrapper around gcc/g++ that executes the compilation of the preprocessed source in a recursive nix-build call.

Derivations that want to use recursion should now set

  requiredSystemFeatures = [ "recursive-nix" ];

to make the daemon socket appear.

Also, Nix should be configured with "experimental-features =
recursive-nix".
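
Applied to the example from the PR description, this might look like the sketch below. It is the same derivation with one added attribute (plus the jq call dropped for brevity and the channel URL quoted); it has not been verified against any particular Nix release.

```nix
let nixpkgs = fetchTarball "channel:nixos-18.03"; in

with import <nixpkgs> {};

runCommand "foo"
  {
    buildInputs = [ nix ];
    NIX_PATH = "nixpkgs=${nixpkgs}";
    # Added relative to the original example: makes the daemon socket appear.
    requiredSystemFeatures = [ "recursive-nix" ];
  }
  ''
    # Recursive call: build a renamed GNU Hello from inside this builder.
    hello=$(nix-build -E '(import <nixpkgs> {}).hello.overrideDerivation (args: { name = "hello-3.5"; })')

    $hello/bin/hello

    mkdir -p $out/bin
    ln -s $hello/bin/hello $out/bin/hello
  ''
```

The machine performing the build is also assumed to have "experimental-features = recursive-nix" in its Nix configuration, as stated above.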
@matthewbauer
Member

matthewbauer commented Nov 6, 2019

I've made an initial version of a nix-ccache flake: https://github.com/edolstra/nix-ccache. It provides a wrapper around gcc/g++ that executes the compilation of the preprocessed source in a recursive nix-build call.

Very cool! This looks a lot like what @layus talked about at NixCon. How costly is it to run every compilation in nix-build, though? Perhaps we need some heuristic to determine whether a C file is big enough to be cacheable; otherwise we impose a constant builder setup cost for every .c file.

@edolstra
Member Author

edolstra commented Nov 7, 2019

A quick unscientific measurement suggests the overhead is ~0.15s per GCC call on my laptop. (This also depends on the size of the preprocessor output, since it needs to be copied to the Nix store.) This is enough to make configure scripts much slower, so right now there is a special check to disable building through recursive Nix when the input is called "conftest". A heuristic like you suggest might be better.

@Ericson2314
Copy link
Member

@volth you just reinvented import from derivation :) But a big benefit of recursive (to me at least) is trying to leverage eval less not more, i.e. nix-exprs can just be one unprivileged way to get drv files.

Other than that, I think you might prefer NixOS/rfcs#40.

  1. inner nix-build's settings will be consistent with the outer

Ret-cont recursive also does this.

  1. returning the list of derivations instead of raw stdout capture of nix-build executable

Never need to go stdout->store path either.

@edolstra
Copy link
Member Author

Wouldn't it be more flexible to add it in form of builtins.nixBuild accepting string of nix code to eval and returning a list of derivations?

Well, hello was just a toy example. The main usefulness of recursive Nix is if you don't know the inner derivations in advance. For example, in the case of nix-ccache (which wraps gcc invocations in nix builds), it's typically a makefile driving the build process, so you don't know at the outer expression level which inner builds are going to be done, or in what order.
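
As a rough illustration of why the inner derivations cannot be listed up front, a compiler wrapper could hand each preprocessed translation unit to an expression like the one below through a recursive nix-build call. This is a hypothetical sketch, not the code in nix-ccache: the file name compile-one.nix, the preprocessed argument, the output name, and the use of runCommandCC with a GCC-based stdenv are all assumptions, and <nixpkgs> is assumed to be resolvable through NIX_PATH inside the builder.

```nix
# compile-one.nix (hypothetical file name)
{ preprocessed }:

with import <nixpkgs> {};

# One derivation per compiler invocation; its input is the preprocessed
# source, which Nix copies into the store when the path is passed in.
runCommandCC "object.o" { src = preprocessed; } ''
  # -x cpp-output tells gcc the input is already-preprocessed C.
  gcc -x cpp-output -c "$src" -o "$out"
''
```

A wrapper might then run something like `nix-build compile-one.nix --arg preprocessed ./foo.i` once per gcc call issued by the makefile, so the set of inner builds is only known while the outer build is running.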

@edolstra edolstra merged commit 69326f3 into master Dec 2, 2019
@edolstra edolstra deleted the recursive-nix branch December 2, 2019 12:00
@domenkozar domenkozar mentioned this pull request Dec 20, 2019
@kolloch
Contributor

kolloch commented Apr 25, 2020

Hi, I am searching for a working example with recursive nix.

Does it work with the nix in nixpkgs-unstable? Are there docs already?

@zimbatm
Member

zimbatm commented Apr 25, 2020

@kolloch have a look at https://github.com/NixOS/nix/pull/3205/files#diff-e9794c2e4d63a50bf65a1a0ce0873a19 for an example. The feature is hidden behind a feature flag.

In terms of Nix releases, it looks like it's only available in master at the moment.

@kolloch
Contributor

kolloch commented Apr 26, 2020

I assume that means my nix daemon also has to be from master?

@zimbatm
Member

zimbatm commented Apr 26, 2020

Yes, in this case the daemon needs to be updated as well since it controls the build sandbox.

@mlvzk
Member

mlvzk commented Aug 24, 2020

When will this land in stable?

@zimbatm
Member

zimbatm commented Aug 24, 2020

When Nix 3.0 is released. Or use nixpkgs.nixUnstable.

@roberth
Member

roberth commented Jul 14, 2021

Probably we should also disallow instantiating/building fixed-output derivations (specifically, those that access the network, but currently we have no way to mark fixed-output derivations that don't access the network). Otherwise sandboxed derivations can bypass sandbox restrictions and access the network.

Fixed-output derivations (FODs) would be part of a recursive Nix solution for fetchNodeModules. I'd like for it to leverage Nix's ability to cache downloads in the form of FOD and share the downloaded modules in the form of store paths.

Instead of prohibiting FODs, we could stop the console logs from going to the recursive Nix socket and send them directly to the calling derivation's log instead. This way, the sandboxed build can only determine whether an FOD is fetchable. That's still sufficient to extract information from the network bit by bit, so we'd also have to "prohibit" failures by killing the parent derivation whenever a recursive derivation fails.

It's worth noting that the ret-cont solution does not have this problem, but can support the fetchNodeModules use case.

@roberth roberth mentioned this pull request Jul 14, 2021
11 tasks
@kvtb
Contributor

kvtb commented Sep 12, 2021

Any chance of backporting to 2.3?

@lukego

lukego commented Oct 25, 2022

The example does not build for me. I wonder why?

I'm running NixOS from nixpkgs master with

  nix.package = pkgs.nixUnstable;
  nix.extraOptions = ''
    experimental-features = nix-command flakes recursive-nix
    trusted-users = luke
  '';

and I have copied the example derivation from the PR into example.nix but the various nix build commands I've tried are all failing:

[luke@snowy:~]$ nix build -L -f example.nix
foo> error: creating directory '/nix/var': Permission denied
error: builder for '/nix/store/ikqxl3gpd54ww9czp0hxydjf0i15a74y-foo.drv' failed with exit code 1;
       last 1 log lines:
       > error: creating directory '/nix/var': Permission denied
       For full logs, run 'nix log /nix/store/ikqxl3gpd54ww9czp0hxydjf0i15a74y-foo.drv'.

[luke@snowy:~]$ sudo nix build -L -f example.nix
foo> error: creating directory '/nix/var': Permission denied
error: builder for '/nix/store/ikqxl3gpd54ww9czp0hxydjf0i15a74y-foo.drv' failed with exit code 1

[luke@snowy:~]$ sudo nix --experimental-features 'nix-command recursive-nix' build -L -f example.nix
foo> error: creating directory '/nix/var': Permission denied
error: builder for '/nix/store/ikqxl3gpd54ww9czp0hxydjf0i15a74y-foo.drv' failed with exit code 1

[luke@snowy:~]$ sudo nix --experimental-features 'nix-command recursive-nix' build --impure -L -f example.nix
foo> error: creating directory '/nix/var': Permission denied
error: builder for '/nix/store/ikqxl3gpd54ww9czp0hxydjf0i15a74y-foo.drv' failed with exit code 1

[luke@snowy:~]$ sudo nix --experimental-features 'nix-command recursive-nix' build --no-sandbox --impure -L -f example.nix
foo> error: cannot open connection to remote store 'daemon': error: reading from file: Connection reset by peer
error: builder for '/nix/store/ikqxl3gpd54ww9czp0hxydjf0i15a74y-foo.drv' failed with exit code 1

Can anyone see which requirement I'm failing to fulfil? I'd love to use recursive Nix.

@lukego lukego mentioned this pull request Oct 28, 2022
@jkarni

jkarni commented Dec 1, 2022

@lukego I also have the same issue. Have you been able to find a solution?

@lukego

lukego commented Dec 1, 2022

@jkarni Yes, I have been able to get past this problem and onto the next problem with recursive nix, that it hangs in large builds 😁. See #7297 for more on that and a (hopefully) working small example for reference.

I am ahem 80% sure that the problem above was resolved by adding requiredSystemFeatures = [ "recursive-nix" ].

@basile-henry

I am ahem 80% sure that the problem above was resolved by adding requiredSystemFeatures = [ "recursive-nix" ].

I can confirm that the system feature was needed for me as well as enabling the experimental feature!

Why is that the case though? Where did you find documentation about this?
I didn't think nix ever cared about the names of system features. I always understood system features as a way to match derivations with builders, not as a way to opt into nix experimental features that I have already opted into (via /etc/nix/nix.conf or the CLI).

@roberth
Member

roberth commented Oct 15, 2024

When using remote builders, you wouldn't want to schedule a derivation that requires the experimental feature onto a machine where it's not enabled, so Nix must require both.

@basile-henry

Ah, so recursive-nix is a special feature because under the hood it uses the local machine as a remote builder?

you wouldn't want to schedule a derivation that requires the experimental feature onto a machine where it's not enabled

I understand that, but would have expected it to be on the user to ensure the experimental nix feature was enabled (either using system features, or simply by ensuring all their remote builders have the experimental nix feature enabled).
