Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not unroll loops on ReleaseSmall #7077

Closed

Conversation

jedisct1
Copy link
Contributor

Trivial fix for #7048, verified to work on the given test case.

Fixes #7048

Trivial fix for ziglang#7048, verified to work on the given test case.

Fixes ziglang#7048
@LemonBoy
Copy link
Contributor

LemonBoy commented Nov 11, 2020

Well well well, I think the proper solution would be to introduce a -fno-unroll-loops (for gentoo haters) so that people who really need it and people who think they're smarter than the compiler can toggle the option in the pipeline.

The cooler solution would be to inhibit the unrolling only for specific loops with some kind of annotation, much better than disabling this optimization for the whole compilation unit.

@jedisct1
Copy link
Contributor Author

Do we really want to introduce more knobs? The intent of ReleaseSmall currently seems to be to produce the smallest possible code (thus the opinionated choice of -Oz and not -Os), and in this context, not unrolling loops by default would make sense.

How about introducing an intermediate size reduction level (-Os equivalent, without loop unrolling being disabled) instead?

@jedisct1
Copy link
Contributor Author

Annotations to enable/disable unrolling of individual loops (similar to #pragma unroll(x)) independently from the global settings would indeed be neat!

@LemonBoy
Copy link
Contributor

Do we really want to introduce more knobs? The intent of ReleaseSmall currently seems to be to produce the smallest possible code (thus the opinionated choice of -Oz and not -Os), and in this context, not unrolling loops by default would make sense.

Well the performance impact on the generated code may be non-negligible and changing how ReleaseSmall behaves leaves the user with no way to opt-out of this change.

How about introducing an intermediate size reduction level (-Os equivalent, without loop unrolling being disabled) instead?

That may be a good compromise.

@jedisct1
Copy link
Contributor Author

The hardest part would be to name that new optimization level.

@andrewrk
Copy link
Member

The cooler solution would be to inhibit the unrolling only for specific loops with some kind of annotation, much better than disabling this optimization for the whole compilation unit.

I believe this is #978.

I'm opposed to adding a new release mode or more global knobs. Is there a real world use case to consider that is not satisfied with this patch + the planned #978?

@jedisct1
Copy link
Contributor Author

-Os is a perfect fit for WebAssembly, as code generators and runtimes currently have limited optimization passes, yet size is important.

In ran quite a few benchmarks ([1] [2]) a while back on multiple WebAssembly runtimes; -O2 and -Os had very similar performance, but having smaller modules to download is an obvious benefit.

@@ -218,7 +218,7 @@ bool ZigLLVMTargetMachineEmitToFile(LLVMTargetMachineRef targ_machine_ref, LLVMM
PMBuilder->SizeLevel = is_small ? 2 : 0;

PMBuilder->DisableTailCalls = is_debug;
PMBuilder->DisableUnrollLoops = is_debug;
PMBuilder->DisableUnrollLoops = is_small | is_debug;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why | instead of ||?

LemonBoy added a commit to LemonBoy/zig that referenced this pull request Nov 12, 2020
Let's follow what Clang does for -Oz and apply the `minsize` and
`optsize` attributes by default.

Closes ziglang#7048
Supersedes ziglang#7077
@LemonBoy
Copy link
Contributor

I believe this is #978.

That proposal has some flaws:

  • LLVM only allows optsize and optnone, meaning you can only disable the optimization pass for a given function,
  • The minimum granularity ensured by the backends is usually a function, not arbitrary chunks of code,
  • The idea here was to let the user fine-tune how the loop is unrolled, having @optimizeFor(.ReleaseFast) do it automagically is probably not fine-grained enough (and, as stated before, you can't make LLVM optimize a single function only)

What I had in mind was something like the following:

fn @setScopeOptions(opt: union(enum) {
    // function-level options, they can only be applied in a fn scope
    Func: struct {
        stack_alignment: ?usize // Replaces setStackAlign
        enable_stack_canaries: bool,
        runtime_safety: bool, // Can also replace setRuntimeSafety
        optimize_hint: enum { None, Compact, NoOptimizations }, // Tentative naming, don't kill me
        // Other options go here
    },
    Loop: struct {
        force_unroll: bool,
        force_unroll_count: usize,
    },
    // Who knows what else we may need
}) void;

This way we can drop a few special-cased builtins and allow us to introduce as many new options we want with ease.

@jedisct1 jedisct1 closed this Nov 12, 2020
andrewrk pushed a commit that referenced this pull request Nov 13, 2020
Let's follow what Clang does for -Oz and apply the `minsize` and
`optsize` attributes by default.

Closes #7048
Supersedes #7077
andrewrk pushed a commit that referenced this pull request Nov 13, 2020
Let's follow what Clang does for -Oz and apply the `minsize` and
`optsize` attributes by default.

Closes #7048
Supersedes #7077
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ReleaseSmall unrolling loops
4 participants