Landing pad accesses corrupted register #112943

Open
purplesyringa opened this issue Oct 18, 2024 · 0 comments

LLVM currently assumes that if a function preserves the value of a register, this register can be safely accessed from a landing pad even if the function unwinds.

For example (x86_64):

#include <stdio.h>

__attribute__((ms_abi, noinline)) void f() { throw 1; }

struct Dropper {
  int x;
  ~Dropper() { printf("Dropper{%d}\n", x); }
};

__attribute__((noinline)) void test(int arg) {
  Dropper dropper{arg};
  f();
}

int main() try { test(1); } catch (...) {
}

f() is marked as ms_abi, where rdi is callee-saved, so the value of arg can still be read from rdi after f() returns. LLVM relies on this on both paths out of the call: the normal return path and test's cleanup pad.

Unfortunately, that's incorrect behavior. According to the Itanium C++ ABI, the landing pad can only rely on the registers that are callee-saved by the base ABI. For Linux, "base ABI" is the SysV ABI, where rdi is caller-saved, so the unwinding library is not required to restore rdi.
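To make the mismatch concrete, here is a minimal sketch that encodes the general-purpose callee-saved sets of the two x86_64 ABIs and checks which registers survive a normal return from an ms_abi callee but are not guaranteed in a landing pad. The helper names (`live_after_return`, `live_in_landing_pad`) are made up for illustration and are not LLVM APIs. Vector registers (xmm6 to xmm15 under Win64) are left out for brevity.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// General-purpose registers a callee must preserve on x86_64.
// SysV is the base ABI on Linux; Win64 is what ms_abi selects.
constexpr std::array<std::string_view, 7> kSysVCalleeSaved = {
    "rbx", "rbp", "rsp", "r12", "r13", "r14", "r15"};
constexpr std::array<std::string_view, 9> kWin64CalleeSaved = {
    "rbx", "rbp", "rsp", "rsi", "rdi", "r12", "r13", "r14", "r15"};

template <std::size_t N>
constexpr bool contains(const std::array<std::string_view, N>& set,
                        std::string_view reg) {
    for (std::string_view r : set) {
        if (r == reg) return true;
    }
    return false;
}

// Hypothetical helper: a value left in `reg` survives a normal return
// from an ms_abi callee.
constexpr bool live_after_return(std::string_view reg) {
    return contains(kWin64CalleeSaved, reg);
}

// Hypothetical helper: an unwinder that only follows the base (SysV) ABI
// is required to restore `reg` before entering the landing pad.
constexpr bool live_in_landing_pad(std::string_view reg) {
    return contains(kSysVCalleeSaved, reg);
}

// rdi and rsi are exactly the registers where the two paths disagree.
static_assert(live_after_return("rdi") && !live_in_landing_pad("rdi"));
static_assert(live_after_return("rsi") && !live_in_landing_pad("rsi"));
// rbx is preserved under both ABIs, so it is safe on either path.
static_assert(live_after_return("rbx") && live_in_landing_pad("rbx"));
```

Under this model, a value that must be visible in a cleanup pad has to live in a register from the SysV set or be spilled to the stack, regardless of the callee's calling convention.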

While LLVM's libunwind goes beyond the EH ABI requirements and restores rdi, libgcc indeed doesn't (which I'd argue is compliant with the standard). When compiled against libgcc, the above code produces Dropper{random garbage} instead of Dropper{1}.


I've tested this on clang 18.1.8 from the Arch repos, built against libgcc. It compiles the test function from above to

       │ test(int)
  1200 │     53                      push   rbx
  1201 │     48 83 ec 20             sub    rsp, 0x20
  1205 │     89 fe                   mov    esi, edi
  1207 │     e8 84 ff ff ff          call   f()
  120c │     48 89 c3                mov    rbx, rax
  120f │     48 8d 3d ee 0d 00 00    lea    rdi, [rel _IO_stdin_used+0x4]        
  1216 │     31 c0                   xor    eax, eax
  1218 │     e8 13 fe ff ff          call   printf@plt
  121d │     48 89 df                mov    rdi, rbx
  1220 │     e8 5b fe ff ff          call   _Unwind_Resume@plt
  1225 │     66 66 2e 0f 1f 84 00    data16 cs nop word [rax+rax]
       │     00 00 00 00

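Here the compiler copies arg into esi before the call (mov esi, edi), and the cleanup pad then passes rsi to printf without reloading it. So in this build the register wrongly assumed to survive unwinding is rsi rather than rdi, but it's the same bug: rsi is also callee-saved under ms_abi and caller-saved under SysV.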
