Overwriting primitives in lambda #3107

anfelor · 2024-10-02T19:01:19Z

This adds the basic primitives needed to support overwriting to lambda.

In the middle end, we call todo_overwrite_not_implemented.
In value_recursion_compiler, we disallow overwriting
In tmc we allow overwriting.
In bytecode, we map reuses to fresh allocations.

goldfirere · 2024-10-02T19:31:03Z

ocaml/lambda/lambda.mli

+        mode : alloc_mode;                             (* Alloc mode of resulting block *)
+      }
+
+and reset_field =


As @anfelor and I discussed over Slack, I think it would be a significant win to change the representation here, including reset_field elements only when we are actually changing a field. (One way to understand this is the observation that 'a option list is isomorphic to (int * 'a) list, ignoring trailing Nones.) Not sure what the consumer of this is going to look like, so maybe we can do something even better than an int to count the missing fields, but storing an int is better than making a list with likely a lot of Nones in it.

See bytegen.ml for a consumer of this representation

Note that each None corresponds to an underscore in the source program (eg. overwrite x with (_,3)). We should thus expect the number of Nones to be somewhat small and should focus just on what the best representation is for code quality.

I would have thought the list representation is easier actually.

Which list representation do you mean? The 'a option list in the PR or the (int * 'a) list proposed by Richard?

And I don't think the number of Nones will be all that small, in the case of record updates: {x with f} can have an arbitrarily large number of Nones.

Since Mark seemed happy with the current design, I would propose we keep the representation as it is now.

ocaml/lambda/lambda.mli

goldfirere · 2024-10-02T19:34:01Z

ocaml/lambda/lambda.ml

@@ -1713,6 +1732,10 @@ let primitive_may_allocate : primitive -> alloc_mode option = function
  | Pmakefloatblock (_, m) -> Some m
  | Pmakeufloatblock (_, m) -> Some m
  | Pmakemixedblock (_, _, _, m) -> Some m
+  | Preuseblock { mode } -> Some mode


But these new primitives can't allocate, can they? It's one of the reasons I don't think the mode is necessary.

The mode will soon also contain uniqueness information that informs us whether the new block will itself be used uniquely. One worry then might be if the optimizer can lift out Preuseblocks that are unique? (In fact, it can't: it could only do that if the cell we reuse is statically known, but that cell will have been marked as unique and can not be lifted out). Furthermore, we sometimes handle Pmakeblock and Preuseblock similarly (eg. in tmc.ml) and it might be helpful if they have the same fields.

This line is in the function primitive_may_allocate, and returning Some mode makes it sound like it might allocate. (Perhaps my comment about the mode buried my main point.)

One worry: it does allocate in bytecode. Does that matter here?

It would be sound to keep the Some mode here, since primitive_may_allocate is only used in transl_primitive where this value causes strictly more checks. Since we have decided to map Preuseblock to Pmakeblock in the backend for the moment, I think we should stick with this code for now. However, I will add a CR to make all of these return None once the backend supports actual overwriting. The bytecode interpreter does not use stack allocation so this check is irrelevant there.

ocaml/lambda/lambda.ml

ocaml/parsing/location.ml

goldfirere · 2024-10-02T19:48:54Z

ocaml/lambda/tmc.ml

    let choices = List.map (choice ctx ~tail:false) blockargs in
    match Choice.find_nonambiguous_tmc_call choices with
    | Choice.No_tmc_call args ->
-        Choice.lambda @@ Lprim (Pmakeblock (tag, flag, shape, mode), args, loc)
+        let block = Constr.makeblock (tag, flag, shape, is_reusing, mode) in
+        Choice.lambda @@ Lprim (block, args, loc)


It's not clear to me that args should be the same here for the two cases of is_reusing. That is, when is_reusing says to allocate, args must be all the fields of the constructor. But when is_reusing says to reuse, args is only the changed fields. But that's not what this code does.

Yes, for the tmc analysis we only need to consider the changed fields, since only they can contain a recursive call. The same code works in both cases.

goldfirere · 2024-10-02T19:52:27Z

ocaml/lambda/tmc.ml

I would expect some tests for the treatment here -- even if the tests just print the lambda code and then fall over. Or at least a loud comment somewhere reminding us to test this thoroughly before release. A mistake here could lead to miscompilation.

ocaml/bytecomp/bytegen.ml

goldfirere · 2024-10-02T19:58:22Z

ocaml/bytecomp/bytegen.ml

This, too, should get a bunch of testing.

anfelor · 2024-10-03T14:42:52Z

I agree that the additions to tmc and bytegen should be tested before we merge this. How about we first merge #3089 and then add some code to the type-checker that maps the overwrites to these primitives? Then we can run a few simple programs (like the unique_rbtree or unique_sort from the old PR) to test this code.

goldfirere · 2024-10-03T18:47:40Z

How about we first merge #3089 and then add some code to the type-checker that maps the overwrites to these primitives? Then we can run a few simple programs (like the unique_rbtree or unique_sort from the old PR) to test this code.

OK. I'll focus there and then we can return to this one.

…primitives

mshinwell · 2024-10-30T14:06:33Z

ocaml/lambda/lambda.mli

+    (* In-place reuse of heap blocks.
+     Each of these takes the allocation to be reused as first argument.
+     The following arguments correspond to the fields that are Reuse_set. *)
+  | Preuseblock of


@chambart and I were just discussing that it's probably easier if these primitives return the block which has been updated, so that subsequent uses of the unique block use the new variable (bound to the result of such primitives). This may help produce a more straightforward handling of the primitives in Flambda 2, in particular to help enforce the necessary "barrier" behaviour.

Oh, right, that is the intended meaning. If Preuseblock was a function, we should have: Preuseblock : memory cell -> overwritten args -> new block. You should not refer to the overwritten memory cell after it was passed to Preuseblock. In fact, referring to the overwritten memory cell again would break lots of properties since the overwrite may change its type.

anfelor · 2024-12-06T20:12:38Z

I have rebase the changes in this PR onto main and created a draft PR at #3350

anfelor added 5 commits October 2, 2024 11:17

Draft overwriting primitives in Lambda (extracted from 3f1e495)

2aebe45

Check that tmc with overwriting is correct

bc3e854

Disallow value recursion with overwriting

10f7d69

Map reuse to fresh allocations in bytecode

377d0f1

Restore formatting

7e8af0a

anfelor requested a review from goldfirere October 2, 2024 19:01

mshinwell added uniqueness lambda Lambda language changes labels Oct 2, 2024

goldfirere reviewed Oct 2, 2024

View reviewed changes

anfelor added 2 commits October 3, 2024 10:09

Fix formatting

1d0b9e5

Implement suggestions

ca6c04b

anfelor added 2 commits October 22, 2024 11:53

Add CR for primitive_may_allocate

82b9968

Merge remote-tracking branch 'upstream/overwriting' into overwriting-…

e9c4870

…primitives

mshinwell reviewed Oct 30, 2024

View reviewed changes

anfelor closed this Dec 6, 2024

anfelor mentioned this pull request Dec 6, 2024

Overwriting in bytecode #3350

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Overwriting primitives in lambda #3107

Overwriting primitives in lambda #3107

anfelor commented Oct 2, 2024

goldfirere Oct 2, 2024

anfelor Oct 2, 2024

anfelor Oct 3, 2024

mshinwell Oct 3, 2024

anfelor Oct 3, 2024

goldfirere Oct 3, 2024

anfelor Oct 22, 2024

goldfirere Oct 2, 2024

anfelor Oct 2, 2024

goldfirere Oct 3, 2024

anfelor Oct 22, 2024

goldfirere Oct 2, 2024

anfelor Oct 3, 2024

goldfirere Oct 2, 2024

goldfirere Oct 2, 2024

anfelor commented Oct 3, 2024

goldfirere commented Oct 3, 2024

mshinwell Oct 30, 2024

anfelor Oct 30, 2024

anfelor commented Dec 6, 2024

Overwriting primitives in lambda #3107

Overwriting primitives in lambda #3107

Conversation

anfelor commented Oct 2, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

anfelor commented Oct 3, 2024

goldfirere commented Oct 3, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

anfelor commented Dec 6, 2024