Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rustc 1.48.0+ generates branchier and less compact code for integer division w/ boundary checks #99961

Closed
mqudsi opened this issue Jul 30, 2022 · 1 comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@mqudsi
Copy link
Contributor

mqudsi commented Jul 30, 2022

The following code, which correctly bypasses overflow and divide by zero conditions, generates branchier and less optimal machine code (under x86_64) when compiled under rustc 1.48.0+ as compared to rustc 1.46.0 and below:

pub fn checked_div_i64(dividend: i64, divisor: i64) -> Option<i64> {
    if dividend > i64::min_value() && divisor != 0 {
            Some(dividend / divisor)
    } else {
        None
    }
}

Godbolt comparison link

The assembly generated in 1.46.0:

example::checked_div_i64:
        xor     eax, eax
        movabs  rcx, -9223372036854775808
        cmp     rdi, rcx
        je      .LBB0_3
        test    rsi, rsi
        je      .LBB0_3
        mov     rax, rdi
        cqo
        idiv    rsi
        mov     rdx, rax
        mov     eax, 1
.LBB0_3:
        ret

vs in 1.48.0:

example::checked_div_i64:
  xor eax, eax
  movabs rcx, -9223372036854775808
  cmp rdi, rcx
  je .LBB0_6
  test rsi, rsi
  je .LBB0_6
  mov rax, rdi
  or rax, rsi
  shr rax, 32
  je .LBB0_3
  mov rax, rdi
  cqo
  idiv rsi
  mov rdx, rax
  jmp .LBB0_5
.LBB0_3:
  mov eax, edi
  xor edx, edx
  div esi
  mov edx, eax
.LBB0_5:
  mov eax, 1
.LBB0_6:
  ret

The astute will observe that rustc 1.47.0 was skipped in the good/bad demarcation above. That's because rustc 1.47.0 introduced an incorrect checked division w/ panic (which is now happening again). My gut feeling is that whatever fixed 1.47's dismal codegen resulted in a still-suboptimal solution as compared to what we had before.

The problem still persists in the latest nightlies, but is obscured by #99960.

@rustbot label +regression-from-stable-to-stable +A-codegen +A-llvm +T-compiler

@mqudsi mqudsi added the C-bug Category: This is a bug. label Jul 30, 2022
@rustbot rustbot added A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. I-prioritize Issue: Indicates that prioritization has been requested for this issue. labels Jul 30, 2022
@mqudsi
Copy link
Contributor Author

mqudsi commented Jul 30, 2022

After the discussion in #99960, I'm going to close this because it's just tracking the minutiae of LLVM's optimization changes between LLVM 10 and LLVM 11, and the penalty isn't that big of a deal (compared to generating the panic code). There are some minor differences in the unoptimized IR but I don't think they're responsible.

@rustbot label -I-prioritize

@mqudsi mqudsi closed this as completed Jul 30, 2022
@rustbot rustbot removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Jul 30, 2022
@workingjubilee workingjubilee closed this as not planned Won't fix, can't repro, duplicate, stale Aug 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. regression-from-stable-to-stable Performance or correctness regression from one stable version to another. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants