[WIP] internal lld linker #36120
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @eddyb (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
😮 Amazing! As for knowing what flags to pass:
So we might be able to borrow some detection code from EDIT: furthermore, we can do this in |
☔ The latest upstream changes (presumably #36117) made this pull request unmergeable. Please resolve the merge conflicts. |
I haven't read clang source code, but my guess is that it probably finds the system libraries by looking in relative paths from the location of the clang binary. So, basically
How would that work? If I build rustc on e.g. nixos, the library search path that gets hardcoded in liblibc would be
We have to decide how to solve this system library search path problem before lld can be used to build std programs. So far, we have been relying on gcc; this has worked well, even when cross compiling, because the cross linker (arm-linux-gnueabihf-gcc) tells us where the cross compiled system libraries are. But without gcc, we either have to ask the user to supply the system library search path for each target, which is annoying for the user, or build some elaborate mechanism to auto-detect where the system libraries are even when cross compiling, which seems like it would get complicated quickly and be prone to ambiguity -- what if there are multiple sysroots for the same target triple but with slightly tweaked codegen options (+neon and -neon for instance)?
In other news, I tried to link a Rust program for x86_64-musl using lld but it errored with this message:
Haven't looked further.
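To make the "gcc tells us where the libraries are" point concrete, here is a minimal sketch of reading library search paths out of text shaped like the output of `gcc -print-search-dirs`. The function name and the sample paths are hypothetical, the `:` separator assumes a Unix host, and this is not how rustc itself obtains the paths (it simply delegates linking to gcc).

```rust
/// Extracts library search directories from text shaped like the output of
/// `gcc -print-search-dirs`, which contains a line `libraries: =/a:/b`.
/// Hypothetical helper for illustration only.
fn library_search_dirs(output: &str) -> Vec<String> {
    output
        .lines()
        // Keep only the `libraries:` line(s); skip `install:` and `programs:`.
        .filter_map(|line| line.strip_prefix("libraries:"))
        .flat_map(|rest| {
            rest.trim()
                .trim_start_matches('=')
                .split(':') // assumes a Unix-style path list
                .filter(|dir| !dir.is_empty())
                .map(str::to_owned)
                .collect::<Vec<_>>()
        })
        .collect()
}

fn main() {
    // Sample text only; real output depends on the installed toolchain.
    let sample = "install: /opt/cross/lib/gcc/arm-linux-gnueabihf/5/\n\
                  programs: =/opt/cross/libexec/gcc/arm-linux-gnueabihf/5/\n\
                  libraries: =/opt/cross/lib/gcc/arm-linux-gnueabihf/5/:/opt/cross/arm-linux-gnueabihf/lib/\n";
    for dir in library_search_dirs(sample) {
        // Each directory would become a `-L` argument for the linker.
        println!("-L{}", dir);
    }
}
```

Without a gcc to query, every one of these directories has to come from the user or from an auto-detection scheme, which is the trade-off described above.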
@japaric The relative thing is what It'd work for distro-built packages, but not distribution via, e.g. |
Nice @japaric! It's actually pretty exciting to me to see how easy this is :) I agree that the next big question is whether to start landing this in tree. To me I think the threshold for that is that we have at least a test for this functionality, but once we have that I'd be fine fleshing out more platforms as they arise. I think it's definitely correct leaving this behind an unstable It's probably worth checking out the impact this has on the compiler though, just to make sure. For example:
cc @brson |
This is sweet. I think we should pursue this, as long as the implementation is cleanly separated from the existing linker code. Some things I'd suggest:
|
Well, right now it can successfully link x86_64 ELFs but the produced binaries segfault when executed. At least that's the case for Rust programs that link to libc. As shown above, a Rust program that doesn't link to libc executes correctly, but that program must be executed under QEMU because it's bare metal. I don't know if just being able to link an executable passes the bar or if we should pursue integrating that bare metal program into our test suite; that would require qemu, perhaps a new test category (run-pass-qemu), etc.
Without this PR (stage1):
With this PR (stage1):
Good question! I don't have a windows/osx box to test. Perhaps we could throw this into the build bots (bors' try) and see if it works? Also, with the configure flag @brson suggested, we could, for now, simply not embed lld on the platforms where it doesn't compile.
That's a good idea. I'll try your suggestions in my next iteration. |
While this PR isn't doing it, I'd like to point out that I am strongly opposed to any move to remove the ability to use the system linker. LLD will never achieve feature parity with all the system linkers out there, link.exe being one of the biggest examples‡. Also it wouldn't really buy us anything for -msvc targets anyway, because we still rely on a lot of libraries that come with the msvc toolchain, so we always have a working system linker. ‡Unless Microsoft seriously steps up their open source game and adds all the stuff to LLD so that it can handle LTCG stuff as well as PDB debuginfo and all those store app specific features, and other details. I doubt this will happen, but then again Microsoft did do Clang/C2 so who knows. |
@retep998 |
Yeah I'd be fine throwing this to bors-dev at any point. @japaric I wonder if we could add a test for the linux MUSL case? E.g. some run-make test which is super gated and only runs in some situations, but that should in theory be our banner use case right? That is, if we have a standard library for x86_64 musl we should be always able to produce a binary. |
@retep998 Microsoft only needs to expose an API through which a gold or LLD plugin could be written. |
It doesn't work though ...
Nor with lld-3.9-rc3 (this PR) or with lld-4.0 (yesterday-ish HEAD). The only case that works so far is C free executables. |
@retep998 reports that lld can link Rust programs on msvc. And one can even use lld to link programs for msvc on Linux:
I'll try that locally and see if I can turn it into a test. |
Microsoft's link.exe LTCG option tells it to do "whole-program optimization", i.e. LTO, which LLD should eventually be able to do for msvc targets. |
@alexchandel Needs cooperation from Microsoft to actually invoke their equivalent of LLVM. |
Update: lld-3.9 can link Rust binaries cross compiled for the thumb*-none-eabi* targets. This means However, linker script support in lld-HEAD is already good enough that you can control the memory @alexcrichton is there a schedule for the next llvm update? I'd like to update this PR to use the |
Whoa, awesome! Currently there's no plans for the next LLVM update, but I don't think there's any particular reason to hold back either. Nowadays we just probably want to coordinate with emscripten to ensure that we update roughly at the same time. |
Done
I've also added a run-make test that we can link a bare metal executable for the ARM Cortex M without having any cross toolchain installed. This test needs to be disabled when lld support is not built into rustc (the default), but I'm not sure how to do that. Do we have an ignore option for that?
I'd like to throw this into the try buildbots but (a) this only works with rustbuild and (b) lld support is disabled by default. I'd have to toggle both rustbuild and lld to on by default before sending this to "try" but I'm not quite sure how to do that in the
P.S. the implementation is still a mess :-)
Not currently that I know of, but we proxy a bunch of random env vars down to compiletest so what's one more :)
Sure, I can throw to the bots. Wouldn't we want to enable lld by default though if we're building LLVM ourselves? |
Hi, an LLD dev here. If you find any mislinking bug, please report it to me or to bugs.llvm.org. I'm happy to help solving issues. |
Thanks for chipping in! I've had some problems linking programs with debuginfo enabled ("SHF_MERGE section not multiple of
Yes, I just pushed a commit that does that. If the try buildbots are using rustbuild (passing --enable-rustbuild to configure) then this can be sent to bors try already. Otherwise, we would have to push another commit enabling rustbuild by default (without --enable-rustbuild)? (This is what I was referring to in my previous comment.)
There are a bunch of failures in dev, but they all look build related (not sure if anything to do with tests)
@japaric LLD has a convenient option to report a bug. If you add |
☔ The latest upstream changes (presumably #35021) made this pull request unmergeable. Please resolve the merge conflicts. |
@rui314 Wow, that's nifty. Will make use of it.
Update: On Linux, I can now link Hello world for the host 🎉. But (a) I had to omit passing
(The base of the diff is how rustc currently invokes lld when linking hello world)
"ld.lld" \
"--as-needed" \
"-z" \
"noexecstack" \
"-L" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib" \
"hello.o" \
"-o" \
"hello" \
"--gc-sections" \
- "-pie" \
"-L" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib" \
"--Bstatic" \
"--Bdynamic" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/libstd-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/libpanic_unwind-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/libunwind-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/librand-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcollections-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/librustc_unicode-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/liballoc_jemalloc-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/liblibc-411f48d3.rlib" \
"$sysroot/lib/rustlib/x86_64-unknown-linux-gnu/lib/libcore-411f48d3.rlib" \
"-l" \
"dl" \
"-l" \
"pthread" \
"-l" \
"gcc_s" \
"-l" \
"pthread" \
"-l" \
"c" \
"-l" \
"m" \
"-l" \
"rt" \
"-l" \
"util" \
"-l" \
- "compiler-rt"
+ "compiler-rt" \
+ -dynamic-linker \
+ /lib64/ld-linux-x86-64.so.2 \
+ -L/usr/lib/gcc/x86_64-linux-gnu/5 \
+ -L/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu \
+ /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crt1.o \
+ /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crti.o \
+ /usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu/crtn.o
When we use In the meantime, should we (a) temporarily disable -pie when lld is used (and perhaps issue a
all:
	rustc -C link-args="-dynamic-linker $A -L$B -L$C $C/crt1.o $C/crti.o $C/crtn.o" -Z use-lld hello.rs
	./hello
where $A, $B and $C are chosen to make this pass on the buildbots. Thoughts, @alexcrichton?
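The `+` lines of the diff above boil down to three host-specific pieces: the dynamic linker, the library search paths and the startup objects. A minimal sketch of assembling them, assuming a hypothetical `HostToolchain` description (neither the struct nor the function exists in rustc):

```rust
/// Hypothetical description of where a host keeps its C runtime pieces.
struct HostToolchain {
    dynamic_linker: String,
    library_dirs: Vec<String>,
    crt_dir: String,
}

/// Builds the extra arguments that cc would normally add implicitly:
/// the dynamic linker, the library search paths and the startup objects.
fn extra_lld_args(tc: &HostToolchain) -> Vec<String> {
    let mut args = vec!["-dynamic-linker".to_string(), tc.dynamic_linker.clone()];
    for dir in &tc.library_dirs {
        args.push(format!("-L{}", dir));
    }
    // Same startup objects as in the diff above.
    for obj in &["crt1.o", "crti.o", "crtn.o"] {
        args.push(format!("{}/{}", tc.crt_dir, obj));
    }
    args
}

fn main() {
    // Values mirror the Linux host from the diff; other hosts will differ.
    let tc = HostToolchain {
        dynamic_linker: "/lib64/ld-linux-x86-64.so.2".to_string(),
        library_dirs: vec!["/usr/lib/gcc/x86_64-linux-gnu/5".to_string()],
        crt_dir: "/usr/lib/x86_64-linux-gnu".to_string(),
    };
    // The joined string is what would be handed to `-C link-args`.
    println!("{}", extra_lld_args(&tc).join(" "));
}
```

All three fields are system specific, which is why they cannot simply be baked into the target specification.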
@rui314 I reported this issue about 6 months ago: https://llvm.org/bugs/show_bug.cgi?id=27194 I haven't checked with latest to see if it's been fixed since, but it was never addressed in bug report afaik. |
Is there a reason to go for embedding LLD directly first, rather than teaching rustc to call it externally first, and then using the API once all the details for library paths etc have been worked out. |
The advantage is that by embedding lld we can start using this feature for two use cases
where using lld has the advantage of having to install one less thing: the cross toolchain (gcc and Also, for these two cases we don't need to decide how to handle the search of system libraries,
This would require adding special logic to |
@m4b LLD is three linkers in one executable. I'm the owner of COFF (Windows) and ELF (Unix) but Mach-O is maintained by other people. |
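As an illustration of the "three linkers in one executable" point, the sketch below maps an object format to the lld driver name usually used for it (`ld.lld`, `lld-link`, `ld64.lld`). The enum, the function names and the triple-matching heuristic are assumptions made for this example, not rustc or lld code.

```rust
/// Object file formats covered by lld's three drivers.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ObjectFormat {
    Elf,
    Coff,
    MachO,
}

/// Crude, hypothetical heuristic: guess the format from a target triple.
fn format_for_target(triple: &str) -> ObjectFormat {
    if triple.contains("windows") {
        ObjectFormat::Coff
    } else if triple.contains("apple") {
        ObjectFormat::MachO
    } else {
        ObjectFormat::Elf
    }
}

/// Returns the lld driver name conventionally used for each format.
fn lld_driver(format: ObjectFormat) -> &'static str {
    match format {
        ObjectFormat::Elf => "ld.lld",
        ObjectFormat::Coff => "lld-link",
        ObjectFormat::MachO => "ld64.lld",
    }
}

fn main() {
    for triple in &["x86_64-unknown-linux-gnu", "x86_64-pc-windows-msvc", "x86_64-apple-darwin"] {
        println!("{} -> {}", triple, lld_driver(format_for_target(triple)));
    }
}
```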
@japaric ok pushed to http://54.176.156.253/grid, may take a moment to show up though |
@japaric looks like the lld submodule may be busted? |
@alexcrichton sorry, I forgot to push my local lld changes. Could you try again? Or should I rebase this first? |
Looks like rustbuild tests passed but perhaps makefile ones didn't? |
(also problems on Windows) |
☔ The latest upstream changes (presumably #36456) made this pull request unmergeable. Please resolve the merge conflicts. |
@japaric what's the status of this? |
@cmr Waiting for an LLVM-UP (4.0) and for the upstream bugs to fix themselves so I get motivated enough to fix the Windows build errors (yuck!). On a more serious note, I don't think I'll have time to work on this until next month.
Ok, closing this due to inactivity for now, but I'd be totally fine with a resubmission of this! |
-Z linker-flavor

(Please read the commit message first)

This PR is an alternative to #36120 (internal lld linker). The main goal of this PR is to make it *possible* to use LLD as a linker to allow out of tree experimentation. Now that LLD is going to be shipped with LLVM 4.0, it should become easier to get a hold of LLD (hopefully, it will be packaged by Linux distros soon).

Since LLD is a multiarch linker, it has the potential to make cross compilation easier (less tools need to be installed). Supposedly, LLD is also faster than the gold linker so LLD may improve build times where link times are significant (e.g. 100% incremental compilation reuse).

The place where LLD shines is at linking Rust programs that don't depend on system libraries. For example, here's how you would link a bare metal ARM Cortex-M program:

```
$ xargo rustc --target thumbv7m-none-eabi -- -Z linker-flavor=ld -C linker=ld.lld -Z print-link-args
"ld.lld" \
  "-L" \
  "$XARGO_HOME/lib/rustlib/thumbv7m-none-eabi/lib" \
  "$PWD/target/thumbv7m-none-eabi/debug/deps/app-de1f86df314ad68c.0.o" \
  "-o" \
  "$PWD/target/thumbv7m-none-eabi/debug/deps/app-de1f86df314ad68c" \
  "--gc-sections" \
  "-L" \
  "$PWD/target/thumbv7m-none-eabi/debug/deps" \
  "-L" \
  "$PWD/target/debug/deps" \
  "-L" \
  "$XARGO_HOME/lib/rustlib/thumbv7m-none-eabi/lib" \
  "-Bstatic" \
  "-Bdynamic" \
  "$XARGO_HOME/lib/rustlib/thumbv7m-none-eabi/lib/libcore-11670d2bd4951fa7.rlib"

$ file target/thumbv7m-none-eabi/debug/app
app: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, not stripped, with debug_info
```

This doesn't require installing the `arm-none-eabi-gcc` toolchain.

Even cooler (but I'm biased) is that you can link Rust programs that use [`steed`] (`steed` is a `std` re-implementation free of C dependencies for Linux systems) instead of `std` for a bunch of different architectures without having to install a single cross toolchain.

[`steed`]: https://github.com/japaric/steed

```
$ xargo rustc --target aarch64-unknown-linux-steed --example hello --release -- -Z print-link-args
"ld.lld" \
  "-L" \
  "$XARGO_HOME/lib/rustlib/aarch64-unknown-linux-steed/lib" \
  "$PWD/target/aarch64-unknown-linux-steed/release/examples/hello-80c130ad884c0f8f.0.o" \
  "-o" \
  "$PWD/target/aarch64-unknown-linux-steed/release/examples/hello-80c130ad884c0f8f" \
  "--gc-sections" \
  "-L" \
  "$PWD/target/aarch64-unknown-linux-steed/release/deps" \
  "-L" \
  "$PWD/target/release/deps" \
  "-L" \
  "$XARGO_HOME/lib/rustlib/aarch64-unknown-linux-steed/lib" \
  "-Bstatic" \
  "-Bdynamic" \
  "/tmp/rustc.lAybk9Ltx93Q/libcompiler_builtins-589aede02de78434.rlib"

$ file target/aarch64-unknown-linux-steed/release/examples/hello
hello: ELF 64-bit LSB executable, ARM aarch64, version 1 (SYSV), statically linked, not stripped, with debug_info
```

All these targets (architectures) worked with LLD:

- [aarch64-unknown-linux-steed](https://github.com/japaric/steed/blob/lld/docker/aarch64-unknown-linux-steed.json)
- [arm-unknown-linux-steedeabi](https://github.com/japaric/steed/blob/lld/docker/arm-unknown-linux-steedeabi.json)
- [arm-unknown-linux-steedeabihf](https://github.com/japaric/steed/blob/lld/docker/arm-unknown-linux-steedeabihf.json)
- [armv7-unknown-linux-steedeabihf](https://github.com/japaric/steed/blob/lld/docker/armv7-unknown-linux-steedeabihf.json)
- [i686-unknown-linux-steed](https://github.com/japaric/steed/blob/lld/docker/i686-unknown-linux-steed.json)
- [mips-unknown-linux-steed](https://github.com/japaric/steed/blob/lld/docker/mips-unknown-linux-steed.json)
- [mipsel-unknown-linux-steed](https://github.com/japaric/steed/blob/lld/docker/mipsel-unknown-linux-steed.json)
- [powerpc-unknown-linux-steed](https://github.com/japaric/steed/blob/lld/docker/powerpc-unknown-linux-steed.json)
- [powerpc64-unknown-linux-steed](https://github.com/japaric/steed/blob/lld/docker/powerpc64-unknown-linux-steed.json)
- [x86_64-unknown-linux-steed](https://github.com/japaric/steed/blob/lld/docker/x86_64-unknown-linux-steed.json)

---

The case where lld is unergonomic is linking binaries that depend on system libraries. Like "Hello, world" for `x86_64-unknown-linux-gnu`. Because you have to pass as linker arguments: the path to the startup objects, the path to the dynamic linker and the library search paths. And all those are system specific so they can't be encoded in the target itself.

```
$ cargo \
    rustc \
    --release \
    -- \
    -C \
    linker=ld.lld \
    -Z \
    linker-flavor=ld \
    -C \
    link-args='-dynamic-linker /lib64/ld-linux-x86-64.so.2 -L/usr/lib -L/usr/lib/gcc/x86_64-pc-linux-gnu/6.3.1 /usr/lib/Scrt1.o /usr/lib/crti.o /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.1/crtbeginS.o /usr/lib/gcc/x86_64-pc-linux-gnu/6.3.1/crtendS.o /usr/lib/crtn.o'
```

---

Another case where `-Z linker-flavor` may come in handy is directly calling Solaris' linker which is also a multiarch linker (or so I have heard). cc @binarycrusader

cc @alexcrichton

Heads up: [breaking-change] due to changes in the target specification format.
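One way to see why a linker *flavor* is needed in addition to the linker path: the same option is spelled differently depending on whether it goes to a compiler driver or straight to a linker. The sketch below is a hypothetical illustration, not the PR's implementation; only the `ld` value appears in the description above, and the `gcc` variant and both function names are assumptions.

```rust
use std::str::FromStr;

/// Hypothetical subset of linker flavors, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
enum LinkerFlavor {
    /// A compiler driver (cc/gcc) that forwards options to the real linker.
    Gcc,
    /// A linker invoked directly, such as ld or ld.lld.
    Ld,
}

impl FromStr for LinkerFlavor {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "gcc" => Ok(LinkerFlavor::Gcc),
            "ld" => Ok(LinkerFlavor::Ld),
            other => Err(format!("unknown linker flavor: {}", other)),
        }
    }
}

/// Spells one raw linker option the way the given flavor expects it.
fn linker_arg(flavor: LinkerFlavor, arg: &str) -> String {
    match flavor {
        // Compiler drivers pass linker options through with `-Wl,`.
        LinkerFlavor::Gcc => format!("-Wl,{}", arg),
        // A bare linker takes the option as-is.
        LinkerFlavor::Ld => arg.to_string(),
    }
}

fn main() {
    let flavor: LinkerFlavor = "ld".parse().expect("known flavor");
    println!("{}", linker_arg(flavor, "--gc-sections"));
    println!("{}", linker_arg(LinkerFlavor::Gcc, "--gc-sections"));
}
```

This matches the link lines shown above, where `--gc-sections` is passed to `ld.lld` without any `-Wl,` prefix.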
@japaric Now that LLVM 4 has been merged, and Rust already has some knowledge about lld
I'd be very happy with |
This commit embeds lld, the LLVM linker, into rustc. If the
-Z use-lld flag is passed to rustc, then rustc will use lld (via a
library call), instead of an external linker like cc, to link a Rust
executable.
This has been tested in three different ways:

-pie flag had to be omitted:

Note that lld doesn't implicitly pass libraries (-lc) or startup
objects (crt1.o) like cc does, so one has to manually pass these to lld
via -C link-args:

Basically, I grabbed the "Hello, world" program from Intermezzos chapter 3 [1]
and turned it into a Cargo project [2], then proceeded to link it using lld.
This linked correctly and, after turning the executable into an ISO image,
it also worked perfectly under QEMU!

This didn't require a toolchain (arm-none-eabi-gcc) or startup files
(newlib). This test has been turned into a run-make test for this feature.

Extra notes
I've only tested ELF executables, not dylibs, MachO or COFF.
lld supports the ELF, MachO and COFF formats. So in theory, with an
internal lld we could drop the dependency on an external linker on
pretty much all the platforms we support.
Relevant news: "lld now implements almost all of the functionality
necessary to function as a system linker for FreeBSD/amd64" -- FreeBSD
folk @ LLVM mailing list 3. The e-mail also mentions that
support for other archs is not quite ready for prime time.
The question is: Would it make sense to start slowly integrating lld
into rustc right now? It would land behind an unstable flag '-Z
use-lld' and we could print a warning message about it being
experimental whenever the flag is used.
Scenarios where it could be advantageous to not depend on an external
linker:

development (already shown to work), embedded (microcontroller) ARM
programs (once lld supports Thumb). Today, to build C-free Rust
programs you still need a C linker (cc). Would be nice to drop the
dependency on cc if you are not going to compile C code.

can cross compile ARM binaries without relying on anything external:
no need for arm-gcc or arm-glibc. Just rustup target add $T and
cargo build --target $T and you are done. This scenario hasn't been
tested though.
Downsides:
support and features like linker scripts are not on parity with ld
yet. However, as mentioned above, the FreeBSD folks have had success
using it as a system linker on x86_64. At least, that architecture
appears to be reaching stability.
need to be kept in sync or lld won't compile. This cuts both ways: if
we want to upgrade lld to support some new feature or fix some bug,
it's likely we'll have to upgrade LLVM as well -- this possibly means
updating LLVM more often.
P.S. This is a PoC; the implementation is a mess/hack. It would have to be
cleaned up before merging it.
cc #9367
cc @alexcrichton @brson