Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suboptimal order of tests in match #117970

Open
Kmeakin opened this issue Nov 16, 2023 · 4 comments
Open

Suboptimal order of tests in match #117970

Kmeakin opened this issue Nov 16, 2023 · 4 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-mir-opt Area: MIR optimizations C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@Kmeakin
Copy link
Contributor

Kmeakin commented Nov 16, 2023

I tried this code (godbolt):

extern "Rust" {
    fn foo() -> u32;
    fn bar() -> u32;
    fn baz() -> u32;
}

pub fn src(x: u32, y: u32) -> u32 {
    unsafe {
        match (x, y) {
            (1, 10) => foo(),
            (2, 10) => bar(),
            (3, 10) => baz(),
            _ => 0,
        }
    }
}

pub fn tgt(x: u32, y: u32) -> u32 {
    unsafe {
        match (y, x) {
            (10, 1) => foo(),
            (10, 2) => bar(),
            (10, 3) => baz(),
            _ => 0,
        }
    }
}

I expected to see this happen:
src should produce assembly as efficient as tgt, since they are both equivalent on all inputs. tgt first checks y against 10, then checks x against 1, 2 and 3:

example::tgt:
        cmp     w1, #10
        b.ne    .LBB1_5
        cmp     w0, #3
        b.eq    .LBB1_6
        cmp     w0, #2
        b.eq    .LBB1_7
        cmp     w0, #1
        b.ne    .LBB1_5
        b       foo
.LBB1_5:
        mov     w0, wzr
        ret
.LBB1_6:
        b       baz
.LBB1_7:
        b       bar

Instead, this happened:
src checks x against 3, then checks y against 10, then checks x against 2, then checks y against 10, then checks x against 1, then checks y against 10:

example::src:
        cmp     w0, #3
        b.eq    .LBB0_5
        cmp     w0, #2
        b.eq    .LBB0_7
        cmp     w0, #1
        b.ne    .LBB0_9
        cmp     w1, #10
        b.ne    .LBB0_9
        b       foo
.LBB0_5:
        cmp     w1, #10
        b.ne    .LBB0_9
        b       baz
.LBB0_7:
        cmp     w1, #10
        b.ne    .LBB0_9
        b       bar
.LBB0_9:
        mov     w0, wzr
        ret

Meta

rustc --version --verbose:

rustc 1.76.0-nightly (6b771f6b5 2023-11-15)
binary: rustc
commit-hash: 6b771f6b5a6c8b03b6322a9c77ac77cb346148f0
commit-date: 2023-11-15
host: x86_64-unknown-linux-gnu
release: 1.76.0-nightly
LLVM version: 17.0.5
Backtrace

<backtrace>

@Kmeakin Kmeakin added the C-bug Category: This is a bug. label Nov 16, 2023
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 16, 2023
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Nov 16, 2023

@rustbot label I-slow

@rustbot rustbot added the I-slow Issue: Problems and improvements with respect to performance of generated code. label Nov 16, 2023
@saethlin
Copy link
Member

alive2 approves: https://alive2.llvm.org/ce/z/NLickb

@saethlin saethlin added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Nov 16, 2023
@Kmeakin
Copy link
Contributor Author

Kmeakin commented Nov 16, 2023

This could be fixed by better heuristics used by rustc when compiling match to MIR

https://dl.acm.org/doi/pdf/10.1145/1411304.1411311, section 8 gives some heuristics that can be used in selecting the order of tests

@DianQK
Copy link
Member

DianQK commented Feb 23, 2024

#121397 will fix this issue, but I'm not sure if it is a special case.

bors added a commit to rust-lang-ci/rust that referenced this issue Feb 29, 2024
… r=<try>

 Re-enable the early otherwise branch optimization

Fixes rust-lang#95162. Fixes rust-lang#119014. Fixes rust-lang#117970.

An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement.

It should not be UB that we pass in an invalid enum discriminant when calling a function, this is more like LLVM's poison value. UB only when used. Although miri would consider the following code to be UB. (It's fine for miri.)

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=18602870aaeb07cbdf7dfcd2c28961a2

I extended the scenario with scalars and the same target values.

r? compiler
bors added a commit to rust-lang-ci/rust that referenced this issue Mar 3, 2024
… r=<try>

 Re-enable the early otherwise branch optimization

Fixes rust-lang#95162. Fixes rust-lang#119014. Fixes rust-lang#117970.

An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement.

It should not be UB that we pass in an invalid enum discriminant when calling a function, this is more like LLVM's poison value. UB only when used. Although miri would consider the following code to be UB. (It's fine for miri.)

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=18602870aaeb07cbdf7dfcd2c28961a2

I extended the scenario with scalars and the same target values.

r? compiler
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-mir-opt Area: MIR optimizations C-bug Category: This is a bug. I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants