primops: make nature of foldl' strictness clearer #7158

sternenseemann · 2022-10-11T12:26:21Z

Edit: Outdated change description!

I was a bit confused by the fact that primop_foldlStrict only (shallowly) forces the input list. To determine what a strict left fold “should” do, I looked at Haskell's foldl' which calls seq on the accumulation value (which you can see by the z :: b type annotation in the arguably confusing code). I think this is fair enough, since Haskell is where most people's intuition about foldl' will come from.

Forcing the accumulation value vCur has the benefit that it limits how deeply nested the intermediate thunks in a fold can become, which hopefully reduces stack usage in larger folds (I haven't done any testing on the impact so far). The list elements are only forced insofar as is necessary when forcing vCur, which means we still avoid unnecessary computations.

I added two new test cases which demonstrate that a) vCur is forced and b) list elements are not necessarily forced. Additionally I did a sanity check, evaluating a nontrivial CI pipeline as well as multiple NixOS configurations which did not turn up any obvious regressions.

roberth · 2022-10-11T16:11:25Z

It does behave correctly for all the other accumulator values.
We should test that.

nix-repl> builtins.foldl' (_acc: x: x) {} [ (throw "even though this accumulator value isn't used, it must be forced") 1 ]                
error: even though this accumulator value isn't used, it must be forced

For the purpose of illustration only, we can look at the lazy foldl in Nixpkgs lib:

nix-repl> lib.foldl (_acc: x: x) {} [ (throw "failure: because this accumulator value isn't used, it must not be forced!") 1 ]       
1

I don't believe laziness in the first accumulator value is going to be a performance issue in the real world, whereas making it strict may cause real world issues, potentially breaking old expressions. While I judge that chance of breaking to be slim, I weigh this to be a bad risk compared to the even slimmer strictness/performance benefit.
To align better with other languages, we could add lib.foldl' to Nixpkgs with the added strictness in the initial accumulator.

{
  foldl' = f: acc0: l: builtins.seq acc0 (builtins.foldl' f acc0 l);
}

In conclusion, I'd recommend to add the above test case instead and leave the implementation as is, and even add a test case so we don't break acc0 laziness in the future.

sternenseemann · 2022-10-11T16:13:05Z

I did a simple benchmark using hyperfine and nixosTests.simple (the first commit is before this PR, the second commit is HEAD of my branch):

> hyperfine -w 5 -L commit ac0fb38e8a5a25a84fa17704bd31b453211263eb,aee7373357f20caaf14f02b72136b7950c55405d -s 'git checkout {commit} && make install'  './out/bin/nix-instantiate --readonly-mode -A nixosTests.simple ~/src/nix/nixpkgs/'
Benchmark 1: ./out/bin/nix-instantiate --readonly-mode -A nixosTests.simple ~/src/nix/nixpkgs/
  Time (mean ± σ):      2.360 s ±  0.052 s    [User: 2.232 s, System: 0.245 s]
  Range (min … max):    2.321 s …  2.493 s    10 runs
 
Benchmark 2: ./out/bin/nix-instantiate --readonly-mode -A nixosTests.simple ~/src/nix/nixpkgs/
  Time (mean ± σ):      2.376 s ±  0.075 s    [User: 2.245 s, System: 0.241 s]
  Range (min … max):    2.329 s …  2.583 s    10 runs
 
Summary
  './out/bin/nix-instantiate --readonly-mode -A nixosTests.simple ~/src/nix/nixpkgs/' ran
    1.01 ± 0.04 times faster than './out/bin/nix-instantiate --readonly-mode -A nixosTests.simple ~/src/nix/nixpkgs/'

So it seems being lazier is slightly faster, but I'm not sure if the difference is significant enough to justify violating users' expectations.

sternenseemann · 2022-10-11T16:17:43Z

> I don't believe laziness in the first accumulator value is going to be a performance issue in the real world, whereas making it strict may cause real world issues, potentially breaking old expressions. While I judge that chance of breaking to be slim, I weigh this to be a bad risk compared to the even slimmer strictness/performance benefit.

This is also fair enough, we can just make it clearer in the documentation that this fold is lazy. Another argument for keeping it as is would be that implementing a strict fold based on the lazy builtin is easy, but not the other way round, potentially leaving us with a much worse performing lazy fold in nixpkgs.

roberth · 2022-10-11T16:19:28Z

> slightly faster, but I'm not sure if it is significant enough

The measurement difference is within σ, so it is not significant.
Considering that the change only affects the first accumulator, and that repeated forcing of an already forced value short-circuits quickly, I would not expect a significant difference.

roberth · 2022-10-11T16:23:42Z

> this fold is lazy

This is not a sufficient characterization of this fold. It is only lazy in the initial accumulator, and only if f does not make it strict.

> Another argument

This gets a bit "academic" so to speak. I don't think we have much of a choice, because we don't want to break existing expressions.

Profpatsch · 2022-10-11T16:24:46Z

Also note: lib.nix currently defines

  foldl = op: nul: list:
    let
      foldl' = n:
        if n == -1
        then nul
        else op (foldl' (n - 1)) (elemAt list n);
    in foldl' (length list - 1);

and

  foldl' = builtins.foldl' or foldl;

where foldl is not strict in any way.

roberth · 2022-10-11T16:58:57Z

> foldl' = builtins.foldl' or foldl;
> where foldl is not strict in any way.

I believe this line to be a very old compatibility shim. Ideally it would wrap f to be strict in the accumulator, but that may also increase the stack requirements, making the stack overflows - that foldl' should solve - occur on even smaller inputs. Possibly not so ideal after all.
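
For illustration only, wrapping the function in the lazy fallback could look roughly like the sketch below. The name `strictFallback` is hypothetical, and `foldl` is the lazy definition quoted earlier in this thread. The recursion still descends from the last element before anything is forced, so this is not expected to reduce stack usage, which is the caveat raised above.

```nix
let
  # Lazy fold as quoted earlier in this thread (inner helper renamed to `go`).
  foldl = op: nul: list:
    let
      go = n:
        if n == -1
        then nul
        else op (go (n - 1)) (builtins.elemAt list n);
    in go (builtins.length list - 1);

  # Hypothetical wrapper: every call forces the incoming accumulator
  # before applying op. Note that this also forces the initial accumulator.
  strictFallback = op: foldl (acc: x: builtins.seq acc (op acc x));
in
  strictFallback (acc: x: acc + x) 0 [ 1 2 3 ]
```

This evaluates to 6, the same result as the example in the `foldl'` documentation.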

Profpatsch · 2022-10-11T17:04:33Z

Of note: since attrsets force their keys in any case, adding the additional strictness to foldl' does not make a difference when your accumulator is an attrset.

nix-repl> x = { a = 1; ${throw "no"} = 2; }

nix-repl> builtins.seq x 1
error: no

roberth · 2022-10-11T17:09:08Z

It does if x is an otherwise ignored accumulator.

This isn't new information though. We should define weak head normal form (WHNF) in the Nix manual so that we can reference it and not feel like we have to document things through sporadic GitHub comments.

Profpatsch · 2022-10-12T08:06:14Z

> We should define weak head normal form (WHNF) in the Nix manual so that we can reference it and not feel like we have to document things through sporadic GitHub comments.

Not just that, but we should also document the strictness behavior of the builtin types (mostly lists and attrsets)
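
As a rough sketch of what such documentation might illustrate (the attribute names here are made up for the example): `builtins.seq` evaluates only the outermost constructor of a list or attrset, while `builtins.deepSeq` also evaluates the contents.

```nix
{
  # seq stops at the attrset itself, so the attribute value stays unevaluated.
  attrsetShallow = builtins.seq { a = throw "attribute value"; } "ok";

  # The same holds for list elements.
  listShallow = builtins.seq [ (throw "list element") ] "ok";

  # deepSeq descends into the attrset, so evaluating this attribute throws.
  attrsetDeep = builtins.deepSeq { a = throw "attribute value"; } "ok";
}
```

Evaluating `attrsetShallow` or `listShallow` yields "ok", whereas evaluating `attrsetDeep` fails with the thrown message.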

Profpatsch · 2022-10-12T08:07:37Z

> It does if x is an otherwise ignored accumulator.

But that’s a different case. Of course if you never force the thing in the first place (by ignoring it), it won’t be looked at.

Profpatsch · 2022-10-12T08:08:30Z

Or maybe I don’t understand what you mean

sternenseemann · 2022-10-12T11:21:33Z

Seems like the only way in which foldl' is currently strict is that it forces the final result thunk before returning it, but at no other point. This strikes me as odd, since the thunk would be forced anyway by whatever forces the thunk of the foldl' function application?!

roberth · 2022-10-12T13:25:27Z

> Seems like the only way in which foldl' is strict currently is that it forces the final result thunk before returning it, but at no other point.

It does force intermediate thunks, except the initial one that you pass to foldl' in its second argument.
See the observation in #7158 (comment)

> This strikes me as odd, since the thunk would be forced anyway by whatever forces the thunk of the foldl' function application?!

That would have been a sensible thought, but it doesn't apply to the linked observation.

sternenseemann · 2022-10-12T15:01:31Z

> It does force intermediate thunks, except the initial one that you pass to foldl' in its second argument.
> See the observation in #7158 (comment)

This is not due to the final force, but seems due to the fact that we use callFunction in a strict way here (e.g. ExprCall::eval would call maybeThunk on the arguments which we don't do in foldl'). This is just my hypothesis, though, would be nice to get someone to confirm this who is more familiar with the evaluator code.

Edit: What I mean to say is that plain callFunction strictly computes its return value, but not its arguments, so moving the force of vCur that I introduced in this PR to after the call wouldn't make a difference.

Consider this diff:

diff --git a/src/libexpr/primops.cc b/src/libexpr/primops.cc
index 28b998474..3c2d589b3 100644
--- a/src/libexpr/primops.cc
+++ b/src/libexpr/primops.cc
@@ -2891,13 +2891,16 @@ static void prim_foldlStrict(EvalState & state, const PosIdx pos, Value * * args
         Value * vCur = args[1];
 
         for (auto [n, elem] : enumerate(args[2]->listItems())) {
             Value * vs []{vCur, elem};
             vCur = n == args[2]->listSize() - 1 ? &v : state.allocValue();
             state.callFunction(*args[0], 2, vs, *vCur, pos);
         }
-        state.forceValue(v, pos);
+        //state.forceValue(v, pos);
     } else {
-        state.forceValue(*args[1], pos);
+        //state.forceValue(*args[1], pos);
         v = *args[1];
     }
 }

We can still replicate your example:

nix-repl> builtins.foldl' (_acc: x: x) {} [ (throw "even though this accumulator value isn't used, it must be forced") 1 ]
error: even though this accumulator value isn't used, it must be forced

You can observe the difference by manually translating the expression into the equivalent calls (also with the above patch applied):

nix-repl> let op = _acc: x: x; in op (op {} (throw "foo")) 1 
1

Consequently, my originally proposed change hardly makes any difference in practice, though.

Edit: So I guess a way forward would be to keep things as is, add more tests for this, and try to make the documentation a bit more accurate. Not sure if we can drop the final force before return; I have the suspicion it doesn't do anything?

roberth · 2022-10-12T15:39:59Z

> builtins.foldl' (_acc: x: x) {} [ (throw "even though this accumulator value isn't used, it must be forced") 1 ]

This is actually a bit flawed. A foldl' that forces the list items instead of the accumulator would produce the same behavior.
A better test case is this:

# foldl' forces the accumulator values produced by the function. This must throw.
builtins.foldl' (_acc: item: item null) {} [ (_: throw "even though this accumulator value isn't used, it must be forced") (_: 1) ]

But an extra test case would make the same point

# foldl' is lazy in the list items when the function is lazy in the list item. This must return true.
builtins.foldl' (acc: item: acc) {} [ (throw "must be lazy in list items") ] == {}

> Edit: So I guess a way forward would be to keep things as is, add more tests for this, and try to make the documentation a bit more accurate.

sgtm

> Not sure if we can drop the final force before return, I have the suspicion it doesn't do anything?

In theory the evaluator could apply a function and return a thunk for the body. I think this would check the set pattern, if any, but stop after that. I'd have to look up in the code what exactly happens though.

In general, I think not forcing a thunk may not always cause a problem because often there will be other opportunities for it to be forced anyway, but in a rare case it may not be. In other words, it's not easy to be sure.

PrimOpFun does not define any constraints in a comment, but that may not mean much. callFunction doesn't come with docs either, and I don't know this off the top of my head.

sternenseemann · 2022-10-15T23:38:37Z

@roberth I updated this change to only update the documentation of foldl', clarifying it a bit. Three new test cases are added which should do a decent job of tying down the current behavior.

roberth

Commented test cases are much appreciated ❤️

roberth · 2022-10-16T10:51:14Z

src/libexpr/primops.cc

-      evaluated first. For example, `foldl' (x: y: x + y) 0 [1 2 3]`
-      evaluates to 6.
+      ...`. For example, `foldl' (x: y: x + y) 0 [1 2 3]` evaluates to 6.
+      The return value of each application of `op` is evaluated strictly,


This would be easier if the Nix docs defined weak head normal form for Nix. Lacking such a precise definition, perhaps we could call it the root?

A bit nitpicky perhaps, but to stay accurate, I'd recommend to avoid "evaluate strictly", because the Nix language does not switch between lazy and strict evaluation, as "strictly" might suggest.

Suggested change

-      The return value of each application of `op` is evaluated strictly,
+      The root of the return value of each application of `op` is evaluated immediately,

I'll use “immediately”, but I think root is too confusing. I'd much rather fix that in one go with seq and deepSeq. In the Nix manual currently evaluate always implies only the root, anything beyond that is evaluate deeply. With the removal of the “strictly” I think it is relatively clear in context.

Let's do the WHNF stuff in a separate PR.

src/libexpr/primops.cc

tests/lang/eval-fail-foldlStrict-strict-op-application.nix

tests/lang/eval-okay-foldlStrict-lazy-initial-accumulator.nix

* Clarify the documentation of foldl': That the arguments are forced before application (?) of `op` is necessarily true. What is important to stress is that we force every application of `op`, even when the value turns out to be unused.
* Move the example before the comment about strictness to make it less confusing: It is a general example and doesn't really showcase anything about foldl' strictness.
* Add test cases which nail down aspects of foldl' strictness:
  * The initial accumulator value is not forced unconditionally.
  * Applications of op are forced.
  * The list elements are not forced unconditionally.
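
The three aspects listed above could be exercised with expressions along the following lines. This is a sketch under the behavior described in this thread, not a copy of the test files added by the PR; the attribute names are invented for the example, and other Nix versions may behave differently.

```nix
{
  # The initial accumulator is ignored by op, so per the discussion above
  # it is not forced and this should evaluate to 2.
  lazyInitialAccumulator =
    builtins.foldl' (_acc: x: x) (throw "initial accumulator") [ 1 2 ];

  # The result of the first application of op is forced even though the
  # second application ignores it, so evaluating this should throw.
  strictOpApplication =
    builtins.foldl' (_acc: f: f null) null
      [ (_: throw "intermediate result") (_: 23) ];

  # The second list element is stored in the result but never demanded,
  # so this should evaluate to 42.
  lazyElements =
    builtins.head
      (builtins.foldl' (acc: x: acc ++ [ x ]) [ ] [ 42 (throw "list element") ]);
}
```

Each attribute can be evaluated on its own, for example with `nix-instantiate --eval -A lazyElements` on a file containing the set.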

sternenseemann · 2023-01-25T13:45:54Z

Any blockers?

roberth · 2023-01-25T17:20:27Z

None as far as I'm concerned, but would like to discuss with the team before merging this.

nixos-discourse · 2023-02-13T15:54:10Z

This pull request has been mentioned on NixOS Discourse. There might be relevant details there:

https://discourse.nixos.org/t/2023-02-10-nix-team-meeting-minutes-31/25438/1

Ericson2314 · 2023-02-19T20:40:38Z

This is approved, the conversation is safe to mostly ignore since the work ended up being much simpler. The tests are surely better than non-tests, and if we are unsure about the documentation part we should split out the tests to merge that first.

infinisil · 2023-09-20T18:23:20Z

Haskell actually also had this problem originally, and it was only accidentally changed to the more intuitive behavior in 2015!

See also Which foldl'? and the corresponding comments on Reddit

infinisil · 2023-09-21T15:32:04Z

Btw I had to work around this here.

sternenseemann force-pushed the foldl-strict-accumulation-value branch from aee7373 to 4edbeba Compare October 15, 2022 23:37

sternenseemann changed the title ~~primops: force accumulation value in foldl'~~ primops: make nature of foldl' strictness clearer Oct 15, 2022

roberth reviewed Oct 16, 2022

sternenseemann force-pushed the foldl-strict-accumulation-value branch from 4edbeba to d0f2da2 Compare October 16, 2022 12:29

roberth approved these changes Oct 16, 2022

fricklerhandwerk added the documentation label Feb 10, 2023

thufschmitt assigned fricklerhandwerk Feb 10, 2023

fricklerhandwerk merged commit dda83a5 into NixOS:master Feb 19, 2023

sternenseemann deleted the foldl-strict-accumulation-value branch February 24, 2023 20:35

infinisil mentioned this pull request Sep 21, 2023

lib.lists.foldl': Make stricter, tests and docs NixOS/nixpkgs#256544

Merged
