Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.note.rustc dylib section flagged SHF_ALLOC on Linux (can cause crashes) #26764

Closed
tylerwhall opened this issue Jul 3, 2015 · 5 comments · Fixed by #35409
Closed

.note.rustc dylib section flagged SHF_ALLOC on Linux (can cause crashes) #26764

tylerwhall opened this issue Jul 3, 2015 · 5 comments · Fixed by #35409
Labels
O-linux Operating system: Linux T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tylerwhall
Copy link

Having the metadata note section flagged "alloc" can cause two types of crashes in certain environments.

1. Dynamic loader crash

Glibc's dynamic loader will try to allocate and copy the entire section onto the stack which will crash if the stack size is limited. This section is particularly large (~2MB for libstd) - not something that can generally be expected to fit on the stack.

The loader is searching for .note.ABI-tag, but does so by reading in any PT_NOTE headers. Because the metadata section is flagged alloc, it gets an entry in the program header, causing it to get loaded by this code.
http://osxr.org/glibc/source/elf/dl-load.c?!v=glibc-2.19#1869 (Note the alloca() at 1879)

I think this may be the cause of #22528, or other crashes with open_verify() in the call stack.

A workaround of using objcopy -R to remove the section fixes this problem, but leaves the object in a brittle state, described below.

2. Crashes in stripped and prelinked shared objects

Working around 1. by stripping out .note.rustc causes a problem with section numbering. Pulling the section out from before another ALLOC section (.bss in the example below) causes that section to be renumbered. Neither GNU strip nor objcopy fix the rest of the references in the executable that point to the section by its index. In particular, entries in the .dynamic section now point to the next section (.comment, here) which breaks when prelinking.

Prelink decides not to fix up the symbols in the dynamic section because it thinks they point to .comment, a section that didn't get relocated during prelinking. So BSS gets relocated, but the dynamic symbols point to the non-prelinked address, causing the program to segfault when it tries to access exported symbols in BSS.

readelf -S libstd-4e7c5e5c.so (current behavior)

  [25] .data             PROGBITS        001938a0 18b8a0 0001a0 00  WA  0   0 16
  [26] .note.rustc       NOTE            00193a40 18ba40 2088d2 00  WA  0   0 16
  [27] .bss              NOBITS          0039c320 394312 000dc8 00  WA  0   0 16
  [28] .comment          PROGBITS        00000000 394312 000011 01  MS  0   0  1
  [29] .ARM.attributes   ARM_ATTRIBUTES  00000000 394323 00003d 00      0   0  1
  [30] .debug_aranges    PROGBITS        00000000 394360 0011a0 00      0   0  8

To fix this, I'm currently using this patch as a workaround.
tylerwhall@fc91059
It runs objcopy on the intermediate metadata object to remove the alloc and write flags. This causes the final link to place the metadata section at the end of the executable and not generate a program header entry.

The fixed version has the metadata with no flags set, no load address, and located near the end of the file along with the debug sections where it can be safely stripped.

readelf -S libstd-4e7c5e5c.so (desired behavior)

  [25] .data             PROGBITS        001938a0 18b8a0 0001a0 00  WA  0   0 16
  [26] .bss              NOBITS          00193a40 18ba40 000dc8 00  WA  0   0 16
  [27] .comment          PROGBITS        00000000 18ba40 000011 01  MS  0   0  1
  [28] .ARM.attributes   ARM_ATTRIBUTES  00000000 18ba51 00003d 00      0   0  1
  [29] .note.rustc       PROGBITS        00000000 18ba90 2088a7 00      0   0 16
  [30] .debug_aranges    PROGBITS        00000000 394338 0011a0 00      0   0  8

Obviously running objcopy like this is not portable. Calling llvm::LLVMSetGlobalConstant() on the metadata variable is enough to remove the SHF_WRITE flag, but I couldn't find a way with LLVM's API to output a section that didn't request to be loaded at run time. Hopefully someone who knows more about LLVM can come up with a more elegant solution to remove SHF_ALLOC from the intermediate metadata object.

@sfackler sfackler added O-linux Operating system: Linux T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Jul 3, 2015
@eefriedman
Copy link
Contributor

You could use module-level inline asm; not sure if that qualifies as "elegant".

@benpye
Copy link

benpye commented May 5, 2016

Just to add, this is currently affecting WSL too, after applying the patch given (updated for current master), rust seems to run fine there. I equally can't find any way of specifying not to apply the SHF_ALLOC flag to the section with the LLVM API...

@eddyb
Copy link
Member

eddyb commented Jun 1, 2016

To expand on that: LLVM internally removes SHF_ALLOC if SectionKind::getMetadata() is used but nothing "user"-facing touches it, only debuginfo and miscellaneous hardcoded sections 😞.

@eddyb
Copy link
Member

eddyb commented Jun 1, 2016

My testcase so far:

#![crate_type="dylib"]
// Any large (>8MB AFAICT) file works, librustc_llvm happens to take up 44MB.
pub const BLOB: &'static [u8] =
include_bytes!("build/x86_64-unknown-linux-gnu/stage0/lib/librustc_llvm-a4922355.so");

Running ldd on the result gives me ldd: exited with unknown exit code (139).

@cuviper
Copy link
Member

cuviper commented Jul 28, 2016

FYI, I filed a glibc bug about the improper alloca:
https://sourceware.org/bugzilla/show_bug.cgi?id=20419

bors added a commit that referenced this issue Aug 14, 2016
[MIR] Add Storage{Live,Dead} statements to emit llvm.lifetime.{start,end}.

Storage live ranges are tracked for all MIR variables and temporaries with a drop scope.
`StorageLive` is lowered to `llvm.lifetime.start` and `StorageDead` to `llvm.lifetime.end`.

There are some improvements possible here, such as:
* pack multiple storage liveness statements by using the index of first local + `u64` bitset
* enforce that locals are not directly accessed outside their storage live range
* shrink storage live ranges for never-borrowed locals to initialization -> last use
* emit storage liveness statements for *all* temporaries
 * however, the remaining ones are *always* SSA immediates, so they'd be noop in MIR trans
 * could have a flag on the temporary that its storage is irrelevant (a la C's old `register`)
   * would also deny borrows if necessary
    * this seems like an overcompliation and with packing & optimizations it may be pointless

Even in the current state, it helps stage2 `rustc` compile `boiler` without overflowing (see #35408).

A later addition fixes #26764 and closes #27372 by emitting `.section` directives for dylib metadata to avoid them being allocated into memory or read as `.note`. For this PR, those bugs were tripping valgrind.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-linux Operating system: Linux T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants