Figure out which LLVM optimisation passes are worth enabling #595

yorickpeterse · 2023-07-19T02:31:47Z

Right now the only optimisation pass we enable is the mem2reg pass, because that's pretty much a requirement for non-insane machine code. We deliberately don't use the O2/O3 options as they enable far too many optimisation passes, and don't give you the ability to opt out of some of them (Swift takes a similar approach).

We should start collecting a list of what passes are worth enabling, and ideally what the compile time cost is versus the runtime improvement. The end goal is to basically enable the passes that give a decent amount of runtime performance improvements, but without slowing down compile times too much.
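One way to make "decent runtime improvement without slowing down compile times too much" concrete is a simple threshold check per candidate pass. The sketch below is only an illustration: the `Timing` struct, the `worth_enabling` function, and both thresholds are hypothetical and not part of the compiler.

```rust
use std::time::Duration;

/// Timings for one build configuration (hypothetical type, for illustration).
#[derive(Clone, Copy, Debug)]
pub struct Timing {
    /// Time spent compiling the benchmark program.
    pub compile: Duration,
    /// Time spent running the compiled program.
    pub run: Duration,
}

/// Returns true if `candidate` speeds up the program by at least
/// `min_speedup` while slowing compilation by at most `max_compile_slowdown`,
/// both expressed as ratios relative to `baseline`.
pub fn worth_enabling(
    baseline: Timing,
    candidate: Timing,
    max_compile_slowdown: f64,
    min_speedup: f64,
) -> bool {
    let base_compile = baseline.compile.as_secs_f64();
    let cand_run = candidate.run.as_secs_f64();

    // Avoid dividing by zero for degenerate measurements.
    if base_compile == 0.0 || cand_run == 0.0 {
        return false;
    }

    let slowdown = candidate.compile.as_secs_f64() / base_compile;
    let speedup = baseline.run.as_secs_f64() / cand_run;

    slowdown <= max_compile_slowdown && speedup >= min_speedup
}
```

The timings themselves would come from building and running a fixed set of benchmark programs with and without the pass enabled; what the thresholds should be is an open question.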

yorickpeterse · 2023-11-17T15:06:23Z

From jinyus/related_post_gen#440 (comment): using OptimizationLevel::Aggressive can have a big impact on the performance compared to None. In itself this isn't surprising, because of course optimizations are beneficial. I however would like to know (somehow) which optimizations are worth enabling, rather than just enabling something as opaque as -O3.

Perhaps as a starting point we can just set that option when using inko build --aggressive, then figure out which ones to explicitly enable for regular builds.

When using `inko build --opt=aggressive`, we now set LLVM's optimization level to "aggressive", which is the equivalent of -O3 for clang. This gives users the ability to have their code optimized at least somewhat, provided they're willing to deal with the significant increase in compile times. For example, Inko's test suite takes about 3 seconds to compile without optimizations, while taking just under 10 seconds when using --opt=aggressive.

The option --opt=balanced still doesn't apply optimizations as we've yet to figure out which ones we want to explicitly opt in to.

See #595 for more details.

Changelog: performance

yorickpeterse · 2023-11-17T16:41:50Z

1a30de9 changes inko build such that --opt=aggressive applies the equivalent of clang's -O3. This significantly increases compile times, but it's better than nothing until we come up with our own list of passes to enable.

yorickpeterse · 2024-11-23T03:21:34Z

At least the following passes are worth looking into more, based on playing around with them to see what effect they have:

instcombine
gvn
sroa (gets rid of redundant alloca instructions and their loads/stores)
simplifycfg (simplifies the CFG, mostly useful for debugging I think)
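As a rough sketch of how such a list could be wired up: LLVM's new pass manager accepts a textual pipeline description, the same format used by `opt -passes=...`. The helper below only builds that string per optimisation level. The `Opt` enum, the `pass_pipeline` function, and the choice and order of passes for `Balanced` are assumptions for illustration, not what the compiler does today.

```rust
/// Hypothetical optimisation levels, mirroring `inko build --opt=...`.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Opt {
    None,
    Balanced,
    Aggressive,
}

/// Returns a textual pass pipeline in the format accepted by LLVM's new pass
/// manager (the same syntax as `opt -passes=...`).
pub fn pass_pipeline(level: Opt) -> String {
    match level {
        // Only mem2reg, matching the current behaviour described in this issue.
        Opt::None => "mem2reg".to_string(),
        Opt::Balanced => {
            // Candidate passes from this thread; the order is a guess and
            // would need to be validated with measurements.
            let passes = ["mem2reg", "sroa", "instcombine", "gvn", "simplifycfg"];

            passes.join(",")
        }
        // `default<O3>` is LLVM's alias for the standard -O3 pipeline.
        Opt::Aggressive => "default<O3>".to_string(),
    }
}
```

For example, `pass_pipeline(Opt::Balanced)` produces `mem2reg,sroa,instcombine,gvn,simplifycfg`, which can be tried against emitted IR with `opt -passes=...` before committing to anything in the compiler.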

yorickpeterse · 2024-11-27T02:47:33Z

Worth adding: even with --opt=aggressive, certain methods such as Int.% aren't performing very well by the looks of it. For example, take this snippet (based on https://github.com/bddicken/languages):

```inko
import std.env (arguments)
import std.int (Format)
import std.rand (Random)
import std.stdio (Stdout)

class async Main {
  fn async main {
    let out = Stdout.new
    let rand = Random.new
    let n = Int.parse(arguments.get(0), Format.Decimal).get
    let r = rand.int_between(0, 10_000)
    let a = Array.filled(with: 0, times: 10_000)
    let mut i = 0

    while i < 10_000 {
      let mut j = 0

      while j < 100_000 {
        a.set(i, a.get(i) + (j % n))
        j += 1
      }

      a.set(i, a.get(i) + r)
      i += 1
    }

    let _ = out.print(a.get(r).to_string)
  }
}
```

On my laptop this takes 24 seconds to run, with about 80% of the time being spent in the code of Int.%. Oddly enough, even if I just reduce that to _INKO.int_rem() it still takes more or less the same amount of time.

I'm not sure how on earth this code is that slow, given that Rust does it in about 2.5 seconds.
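For reference, a Rust version of the same loop might look roughly like the sketch below. This is not the exact program that was timed: `run_loops` is a hypothetical name, and `n` and `r` are passed in as parameters instead of being read from the arguments and a random number generator, so the code needs no external crates.

```rust
/// Hypothetical Rust port of the Inko loop above.
///
/// `n` is the divisor (must be non-zero, otherwise `%` panics), `r` is the
/// value added to every element, `outer` is the array length, and `inner` is
/// the number of iterations of the inner loop.
pub fn run_loops(n: i64, r: i64, outer: usize, inner: i64) -> Vec<i64> {
    let mut a = vec![0i64; outer];

    for i in 0..outer {
        for j in 0..inner {
            // Rust's `%` is a remainder; for non-negative `j` and positive
            // `n` this gives the same result as a modulo.
            a[i] += j % n;
        }

        a[i] += r;
    }

    a
}
```

Calling `run_loops(n, r, 10_000, 100_000)` matches the sizes used in the Inko program. Because `n` is only known at runtime, the remainder can't be replaced by cheaper arithmetic at compile time in either language.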

yorickpeterse · 2024-11-28T14:53:03Z

Curiously, the above program finishes in only 3.68 seconds on my desktop. Perhaps the Intel CPU on my laptop is just really terrible at this code for some reason?

Depending on how LLVM decides to optimize things, these attributes may help improve code generation, though it's difficult to say for certain how much at this stage.

See #595 for more details.

Changelog: performance

yorickpeterse added the "accepting contributions" and "compiler" labels Jul 19, 2023

yorickpeterse mentioned this issue Nov 17, 2023 in jinyus/related_post_gen#440 ("Add Inko", merged)

yorickpeterse modified the milestones: 0.18.0, 0.19.0 Oct 22, 2024
