Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missed Optimization: Inefficient Handling of Mutable Reference in Simple Conditional Assignment #124346

Open
paolobarbolini opened this issue Apr 24, 2024 · 5 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code.

Comments

@paolobarbolini
Copy link
Contributor

The following example shows how LLVM isn't able to optimize trivial branches when dealing with &mut pointers, while it can with owned values. I'd expect both functions to override current_len with 0 without needing to branch.

#[no_mangle]
pub fn unoptimized(current_len: &mut u32) {
    if *current_len != 0 {
        *current_len = 0;
    }
}

#[no_mangle]
pub fn optimized(mut current_len: u32) -> u32 {
    if current_len != 0 {
        current_len = 0;
    }
    current_len
}
unoptimized:
        cmp     dword ptr [rdi], 0
        je      .LBB0_2
        mov     dword ptr [rdi], 0
.LBB0_2:
        ret

optimized:
        xor     eax, eax
        ret
define void @unoptimized(ptr noalias nocapture noundef align 4 dereferenceable(4) %current_len) unnamed_addr {
start:
  %_2 = load i32, ptr %current_len, align 4
  %0 = icmp eq i32 %_2, 0
  br i1 %0, label %bb3, label %bb1

bb1:
  store i32 0, ptr %current_len, align 4
  br label %bb3

bb3:
  ret void
}

define noundef i32 @optimized(i32 noundef %0) unnamed_addr {
start:
  ret i32 0
}
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 24, 2024
@nikic nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code. and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Apr 25, 2024
@nikic
Copy link
Contributor

nikic commented Apr 25, 2024

Whether this optimization is legal is still an open question, but the current answer seems to be leaning to "no". See #114886 for some entry points to that discussion. (cc @RalfJung)

@RalfJung
Copy link
Member

Cc @rust-lang/opsem

@RalfJung
Copy link
Member

RalfJung commented Apr 26, 2024

Yeah, under some of the considered models for Rust references (in particular, Tree Borrows), if a function never writes to a mutable reference, then the compiler cannot make it write. This makes some unsafe code a lot easier to reason about, greatly reducing the risk of UB. But it also means we can't hoist the write out of the if in this example.

rust-lang/unsafe-code-guidelines#133 is probably the best UCG issue for this question.

@taralx
Copy link
Contributor

taralx commented Apr 27, 2024

I'm not clear on what the optimizer is supposed to do here. Unconditionally write?

@comex
Copy link
Contributor

comex commented Apr 27, 2024

An unconditional write is smaller and often faster, but it's not always faster. If the cache line is not already dirty, writing will dirty it, which requires the CPU to get an exclusive lock on the cache line (could be slow if there's contention with other cores) and also requires the line to eventually be written back to main memory (using up memory throughput).

This doesn't apply to the second example function, where the argument is passed and returned in a register.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such I-slow Issue: Problems and improvements with respect to performance of generated code.
Projects
None yet
Development

No branches or pull requests

6 participants