Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize tail calls #115

Open
irh opened this issue Nov 11, 2021 · 2 comments
Open

Optimize tail calls #115

irh opened this issue Nov 11, 2021 · 2 comments

Comments

@irh
Copy link
Contributor

irh commented Nov 11, 2021

Currently no effort is made in Koto to optimize tail calls, making it easy to run into stack overflows when implementing recursive functions.

e.g. The following script takes a huge amount of memory to execute:

f = |x|
  if x == 0
    return
  f x - 1
f 123456789

The idea would be that if a terminating expression (i.e. throw or return (implicit returns for the last expression in a function included)) in a function is a call, then instead of creating a new frame, the current function's execution frame can be reused.

@irh
Copy link
Contributor Author

irh commented Dec 17, 2021

I looked at this a bit today with @alisomay and here are some notes from our discussion:

  • The optimization opportunity is in reusing registers in the value stack, rather than pushing a new frame's worth of registers into the stack.
  • The call stack (containing frame info) can be left alone for the time being, we wouldn't want to lose the stack trace in the case of an error, so at least the chunk and ip for each frame will be needed.
    • There may be a compression optimization possible when the chunk and ip match in subsequent frames..
  • A new TailCall instruction can be added which performs a tail call, reusing the current frame's registers.
    • In the case of a binary op, the RHS of the op would be the tail call, the LHS would have already been evaluated. The LHS result then needs to be cached, probably in register 0, and then the tail call frame base would be register 1, with result register also 1. The op can then be executed following the tail call before the function exits.
      • In the case of a binary op and the current frame's frame base is unused, then it would be possible to cache the LHS result in the frame base rather than register 0. How this would work exactly isn't clear (it would effectively be saying to the op that the LHS is in register -1), but it would still be a win to cache in register 0 so I think this doesn't have to be figured out to move forward.
  • To compile the TailCall instruction, the compiler could pass an 'allow tail call' flag around while compiling nodes. This flag would be set to true when compiling a return expression, and then set to false when a tail call should be explicitly disallowed (e.g. when compiling the LHS of a binary op).
    • The 'result register' which is passed around in the 'compile node' functions could be replaced with a context struct (similar to the one used by the parser) which can then be extended with the new tail call flag.

@irh
Copy link
Contributor Author

irh commented Mar 1, 2024

Some groundwork for this was done in #290. I didn't go any further because I'd like to understand more about how debugging might one day work before making any guarantees regarding tail calls.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant