-
Notifications
You must be signed in to change notification settings - Fork 994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Large penalty for frequent side-trace exit #114
Comments
There is a BountySource bounty available to anybody who improves LuaJIT such that a modest number of unbiased branches (e.g. the three listed in the example above) could be included in a hot trace with only modest overhead. The example code on lukego/blog#8 could be used as an initial/partial test case. |
I have thought about how to implement this. This plan seems like it might work. It is based on LuaJIT using a 2-pass optimization scheme: one set of forward optimizations and one set of backwards optimizations.
This has some problems, such as a potentially exponential trace blow-up, but should still be an improvement over the current situation. |
There's no easy fix and no general solution with the current compiler backend. Closing, since there's already #37 for the actual solution. |
Implement asm_prof.
The overhead of a frequently taken branch to a side-trace can be dramatic. See lukego/blog#8 for a discussion. This is presumably the root of the optimization advice to "avoid unbiased branches" and "prefer branch-free algorithms".
I decided to file this issue with a bounty (see below) after a tweet from @mraleph raised the very exciting possibility that there may be a simple (partial) solution.
This is a problem for common applications. For example inner loops in networking code will naturally include some unbiased branches such as:
and to hoist such branches outside of inner loops (hot traces) can require elaborate programming practices, for example to sort packets into separate categories (taking the branch penalties) before processing them (doing the "heavy lifting" separate branch-free traces). This is unfortunate because it substantially "raises the bar" on how much expertise is required to write reasonably efficient code.
It would be wonderful if the JIT could somehow be enhanced such that a modest number of unbiased branches, such as the examples listed above, could be taken without a major performance penalty. This would bring efficient programming within the reach of more people and especially people without a strong mental model of trace-based just-in-time compilation (a large and interesting group!)
Here are some "in the wild" examples where a lot of programmer-time is being spent dealing with this issue:
and previous discussion in the LuaJIT community:
The text was updated successfully, but these errors were encountered: