Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

panic branch does not get removed in presence of other function (that also does not get it removed) #119923

Open
antonilol opened this issue Jan 13, 2024 · 4 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@antonilol
Copy link
Contributor

I tried this code (compiler explorer):

#[derive(Clone, Copy)]
pub enum Enum {
    None,
    Variant1,
    Variant2,
}

impl Enum {
    #[no_mangle]
    pub fn as_str_1(self) -> &'static str {
        // switch table for reference
        match self {
            Self::None => "",
            Self::Variant1 => "1",
            Self::Variant2 => "2",
        }
    }

    #[no_mangle]
    pub fn as_str_2(self) -> &'static str {
        let len = (self as usize + 1) / 2;
        let offs = self as usize / 2;
        &"12"[offs..len + offs]
    }

    #[no_mangle]
    pub fn as_str_3(self) -> &'static str {
        let len = (self as usize + 1) / 2;
        let offs = self as usize;
        &" 12"[offs..len + offs]
    }
}

and this code (compiler explorer):

#[derive(Clone, Copy)]
pub enum Enum {
    None,
    Variant1,
    Variant2,
}

impl Enum {
    #[no_mangle]
    pub fn as_str_1(self) -> &'static str {
        // switch table for reference
        match self {
            Self::None => "",
            Self::Variant1 => "1",
            Self::Variant2 => "2",
        }
    }

    #[no_mangle]
    pub fn as_str_2(self) -> &'static str {
        let len = (self as usize + 1) / 2;
        let offs = self as usize / 2;
        &"12"[offs..len + offs]
    }
}

(difference between the two is that as_str_3 only exists in the first example)

I expected to see this happen:
all as_str_ functions compile to roughly the same code (no memory accesses, just some pointer addition and getting the length right)

something like this (compiler explorer):

as_str_4:
        movzx   edx, dil
        lea     rax, [rip + .L__unnamed_3]
        add     rax, rdx
        inc     edx
        shr     edx
        ret

.L__unnamed_3:
        .ascii  " 12"

Instead, this happened:
as_str_1 became a switch table, so i pointed the compiler in the right direction with as_str_2, i tried a few other things, as_str_3 with an integer division less but a byte longer string and as_str_4 with unsafe code to get the desired output.
as_str_2's panic branch only got removed when i commented as_str_3 out, as_str_3 always had one

Meta

rustc --version --verbose:

rustc 1.77.0-nightly (62d7ed4a6 2024-01-11)
binary: rustc
commit-hash: 62d7ed4a6775c4490e493093ca98ef7c215b835b
commit-date: 2024-01-11
host: x86_64-unknown-linux-gnu
release: 1.77.0-nightly
LLVM version: 17.0.6
@antonilol antonilol added the C-bug Category: This is a bug. label Jan 13, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Jan 13, 2024
@saethlin
Copy link
Member

saethlin commented Jan 13, 2024

The reason these are different is that in the well-optimized case, LLVM does IPSCCP (Interprocedural Sparse Conditional Constant Propagation) on str::is_char_boundary, before it gets inlined into the SliceIndex impl which gets inlined into as_str_2.

Good: https://godbolt.org/z/Y5nsTqvMd
Bad: https://godbolt.org/z/MnPKq57qM

That's more an explanation of why the behavior here is so strange; there's an interprocedural optimization in play. The relevant optimization should still be possible post-inlining.

@saethlin saethlin added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-heavy Issue: Problems and improvements with respect to binary size of generated code. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Jan 13, 2024
@saethlin
Copy link
Member

It looks like turning up the MIR inliner is sufficient to generate the good code: https://godbolt.org/z/6dY6feWxq I can see that we inline the SliceIndex impl in MIR, probably that just exposes all the information to LLVM earlier in the optimization pipeline.

@antonilol
Copy link
Contributor Author

It looks like turning up the MIR inliner is sufficient to generate the good code: https://godbolt.org/z/6dY6feWxq I can see that we inline the SliceIndex impl in MIR, probably that just exposes all the information to LLVM earlier in the optimization pipeline.

i see, but not for as_str_3, and the -Z options only work in nightly

The reason these are different is that in the well-optimized case, LLVM does IPSCCP (Interprocedural Sparse Conditional Constant Propagation) on str::is_char_boundary, before it gets inlined into the SliceIndex impl which gets inlined into as_str_2.

so iiuc, there are too much function boundaries (at least str::is_char_boundary and SliceIndex::get) where IPSCCP can not get through?
looks like that could be fixed with a few #[inline]s, but this could have undesirable effects for performance of other code.

and regarding as_str_1, would it be possible to add an optimization in rustc (for match statements that produce &str or &[u8] in general) to do pointer arithmetic instead of switch tables? simple integer switch tables could also be avoided with a bit of math (compiler explorer)

@saethlin
Copy link
Member

i see, but not for as_str_3, and the -Z options only work in nightly

I'm not offering a solution here, just diagnosing the source of the problem so it's easier for someone else to pick this up.

looks like that could be fixed with a few #[inline]s, but this could have undesirable effects for performance of other code.

The relevant functions are already #[inline]. I think the problem is that the relevant optimization in LLVM is only done before inlining, so you can either get it by inlining in rustc or by only having one caller and relying on an interprocedural optimization.

and regarding as_str_1, would it be possible to add an optimization in rustc (for match statements that produce &str or &[u8] in general) to do pointer arithmetic instead of switch tables? simple integer switch tables could also be avoided with a bit of math (compiler explorer)

We have a handful of optimizations around match that were merged then disabled when soundness holes were found. Nobody has fixed them. rustc just doesn't have a good IR for implementing optimizations right now; I think fixing the missed optimization(s) in LLVM is a much better use of effort.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-heavy Issue: Problems and improvements with respect to binary size of generated code. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants