runtime: cgo call to symbol from library loaded dynamically will panic with go 1.21.1 and ld >2.38 #63264
Comments
I tried building the binary in my reproduction repo with Go 1.20:
Go 1.21:
So in this case it didn't even show up as an undefined symbol in Go 1.21.1. I think this would explain why it panics in Go 1.21; in Go 1.20 the symbol is there as an undefined symbol, which is why it works with a |
Tried it in an Ubuntu 20.04 VM (
|
I'm trying to rule out
By checking with
However, running the binary still results in a panic. So I guess the difference in symbols in the executable isn't the root cause here. |
|
Actually, this works. In the last comment, I was testing by just trying to call the symbol and not loading the library. However, with |
In my reproduction repo, I ran the |
I debugged two binaries built with
CGO output
In the generated CGO output, the generated C function
Go 1.21
At the line
Go 1.20
In
|
Added a new experiment in the reproduction repo where I wrote a small C program that attempts to get symbol resolution the same way that worked in When compiled with When compiled with I'm sure CGO's version of "call this function from the header" is different than C's version of "call this function from the header", although when I look at the Admittedly it does seem off to me that a Either way, it is very strange that |
@golang/compiler |
On my Rolling Debian Testing machine, I did a
Go 1.20.8
Go 1.21.1
In the Go 1.21.1 dump, the generated cgo binding generates a |
Could |
The
I do think it's the PLT. So I guess perhaps in the PLT itself this unresolved symbol doesn't have a section like I expect it would and perhaps that explains why |
I have now confirmed that
|
This code has me suspicious, however I tried it with the go/src/cmd/link/internal/amd64/asm.go Lines 248 to 263 in 5351bcf
|
I'm new to actually working with the Go codebase. I tried to add some |
It seems that the code from |
Stupid question: if you are already loading a library using dlopen(), why not just use "dlsym" to find the address of the function you are interested in and call it that way?
FYI, one thing that I think can help when working on these sorts of problems is to use the Go linker's "-tmpdir" option. Example:
$ rm -rf /tmp/xxx
The object files in /tmp/xxx are going to be the ones passed to the external linker in the final step, so it is a good spot to inspect them (both Go and C objects) to see what's going on. |
No, it is a good question. This is generally the best way to do this and what I would do if I wrote it myself.
Great idea, thank you! I didn't notice this flag when looking through options. I'll give that a try. |
A The result of my issue seems to be here: go/src/cmd/link/internal/ld/lib.go Lines 1682 to 1691 in 122b35e
When I forced this into the old behaviour (always adding -rdynamic to argv), my reproduction worked as expected. So in my reproduction, I tried adding -Wl,--export-dynamic, and building with go1.21.1 worked.
So I'm tempted to say this isn't really a bug. This is just a strange behaviour in this particular case when I suppose I'll ping @ianlancetaylor in case he's interested since it was his change, but looking at the original issue from the change I think it makes sense to stay the way it is now (at least based on what I understand). So I'm going to suggest to the I will now close this issue. Thanks Than for the suggestions! |
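To make the conclusion above concrete, here is a simplified model of the flag choice as I understand it. It is not a copy of the linked lines in lib.go. The function name and parameters are hypothetical, and the two details it relies on are assumptions on my part: that Go 1.21 passes per-symbol -Wl,--export-dynamic-symbol=NAME flags for cgo dynamic exports when the external linker accepts that flag, and that it falls back to -rdynamic otherwise. If that is right, it would be consistent with an older ld such as 2.34 keeping the old behaviour.

```go
package main

import (
	"fmt"
	"sort"
)

// externalLinkExportFlags is a hypothetical model of the decision, not the
// Go linker's real code.
//   forceRdynamic:      build modes that always need every global exported
//   perSymbolSupported: whether the external linker accepts
//                       -Wl,--export-dynamic-symbol=NAME (assumption)
//   cgoExports:         symbols marked for dynamic export via cgo
func externalLinkExportFlags(forceRdynamic, perSymbolSupported bool, cgoExports []string) []string {
	if forceRdynamic || !perSymbolSupported {
		// Old behaviour: export all global symbols.
		return []string{"-rdynamic"}
	}
	// Newer behaviour: export only the symbols that were asked for.
	flags := make([]string, 0, len(cgoExports))
	for _, name := range cgoExports {
		flags = append(flags, "-Wl,--export-dynamic-symbol="+name)
	}
	sort.Strings(flags)
	return flags
}

func main() {
	// Linker without per-symbol support: falls back to -rdynamic.
	fmt.Println(externalLinkExportFlags(false, false, nil)) // prints [-rdynamic]
	// Linker with per-symbol support and no cgo exports: nothing is added.
	fmt.Println(externalLinkExportFlags(false, true, nil)) // prints []
}
```

In this model the reproduction lands in the second case, so no export flag reaches ld at all. Adding -Wl,--export-dynamic by hand, as described in the comment above, puts the flag back explicitly.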
Good detective work @braydonk. Yeah, in retrospect the export dynamic change would seem to make sense given what you described. |
Thanks for digging into this. |
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
What did you do?
I created a minimal reproduction setup at https://github.com/braydonk/cgo_dl_repro
In this scenario, I have a header file that references a single function get42 that I will get from a shared object, which I will load at runtime with dlopen. The ld flags -Wl,--unresolved-symbols=ignore-in-object-files are used.
First, I run make liblib, which will compile the C file in this repo that implements the get42 function and then turn it into a shared object.
Then I run go run .
What did you expect to see?
In go1.20.8, and in go1.21.1 with ld version 2.34, I get the expected result:
What did you see instead?
In go1.21 with an ld version >2.38 I get a panic:
Additional Info
This seems to be a result of how CGO handles --unresolved-symbols=ignore-in-object-files. The unresolved symbol results in SIGSEGV because the address of the symbol is 0x0. In go1.20.8, when I completely skip the dlopen step and just try to call C.get42() without loading anything, I get an unresolved symbol lookup error:
However, in go1.21.1, I get a panic identical to the one from calling it after loading the library.
Different ld versions
I tested the different ld versions by changing distros entirely. I have my personal machine, which is on a Rolling Debian Testing distro, and VMs on Debian Bullseye (11), Ubuntu Jammy (22.04), and Ubuntu Focal (20.04). The panic in go1.21.1 occurs on every OS except Ubuntu Focal. The only difference I could think of was the lower ld version, which is why I have called that out, but technically there could be some other difference that I missed.
Why the strange setup?
This setup may seem very oddly specific. I am mirroring the setup used by NVIDIA's Go NVML bindings; we discovered this error through our usage of that library. See NVIDIA/go-nvml#36; in particular, scroll down to the newest comments, which talk about how this specific breakage happened after upgrading to go1.21.1.