Optimize tail calls #115

irh · 2021-11-11T16:12:14Z

Currently no effort is made in Koto to optimize tail calls, making it easy to run into stack overflows when implementing recursive functions.

e.g. The following script takes a huge amount of memory to execute:

f = |x|
  if x == 0
    return
  f x - 1
f 123456789

The idea would be that if a terminating expression (i.e. throw or return (implicit returns for the last expression in a function included)) in a function is a call, then instead of creating a new frame, the current function's execution frame can be reused.

The text was updated successfully, but these errors were encountered:

irh · 2021-12-17T17:05:52Z

I looked at this a bit today with @alisomay and here are some notes from our discussion:

The optimization opportunity is in reusing registers in the value stack, rather than pushing a new frame's worth of registers into the stack.
The call stack (containing frame info) can be left alone for the time being, we wouldn't want to lose the stack trace in the case of an error, so at least the chunk and ip for each frame will be needed.
- There may be a compression optimization possible when the chunk and ip match in subsequent frames..
A new TailCall instruction can be added which performs a tail call, reusing the current frame's registers.
- In the case of a binary op, the RHS of the op would be the tail call, the LHS would have already been evaluated. The LHS result then needs to be cached, probably in register 0, and then the tail call frame base would be register 1, with result register also 1. The op can then be executed following the tail call before the function exits.
  - In the case of a binary op and the current frame's frame base is unused, then it would be possible to cache the LHS result in the frame base rather than register 0. How this would work exactly isn't clear (it would effectively be saying to the op that the LHS is in register -1), but it would still be a win to cache in register 0 so I think this doesn't have to be figured out to move forward.
To compile the TailCall instruction, the compiler could pass an 'allow tail call' flag around while compiling nodes. This flag would be set to true when compiling a return expression, and then set to false when a tail call should be explicitly disallowed (e.g. when compiling the LHS of a binary op).
- The 'result register' which is passed around in the 'compile node' functions could be replaced with a context struct (similar to the one used by the parser) which can then be extended with the new tail call flag.

irh · 2024-03-01T20:06:02Z

Some groundwork for this was done in #290. I didn't go any further because I'd like to understand more about how debugging might one day work before making any guarantees regarding tail calls.

irh added the performance label Nov 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize tail calls #115

Optimize tail calls #115

irh commented Nov 11, 2021 •

edited

Loading

irh commented Dec 17, 2021

irh commented Mar 1, 2024

Optimize tail calls #115

Optimize tail calls #115

Comments

irh commented Nov 11, 2021 • edited Loading

irh commented Dec 17, 2021

irh commented Mar 1, 2024

irh commented Nov 11, 2021 •

edited

Loading