Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: GOARM64=v8.1 crash with illegal hardware instruction on v8.6 #71411

Closed
sylr opened this issue Jan 23, 2025 · 16 comments
Closed

runtime: GOARM64=v8.1 crash with illegal hardware instruction on v8.6 #71411

sylr opened this issue Jan 23, 2025 · 16 comments
Labels
arch-arm64 BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Milestone

Comments

@sylr
Copy link

sylr commented Jan 23, 2025

Go version

go1.24-608acff84

Output of go env in your module/workspace:

AR='ar'
CC='clang'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='clang++'
GCCGO='gccgo'
GO111MODULE=''
GOARCH='arm64'
GOARM64='v8.0'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/Users/sylvain/Library/Caches/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/Users/sylvain/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/jq/lq71x93965l422wp3h87r1lr0000gn/T/go-build807651234=/tmp/go-build -gno-record-gcc-switches -fno-common'
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMOD='/Users/sylvain/git/nats-server/go.mod'
GOMODCACHE='/Users/sylvain/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/sylvain/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/Users/sylvain/sdk/gotip'
GOSUMDB='sum.golang.org'
GOTELEMETRY='on'
GOTELEMETRYDIR='/Users/sylvain/Library/Application Support/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/Users/sylvain/sdk/gotip/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='devel go1.24-608acff84 Wed Jan 22 10:13:04 2025 -0800'
GOWORK='/Users/sylvain/git/nats-server/go.work'
PKG_CONFIG='pkg-config'

What did you do?

I tried to compile a darmin/arm64 binary and specifying a GOARM64 version my Macbook Air M2 should support.

What did you see happen?

 sylvain@tragicomix > ~/git/nats-server > main ± > GOARM64=v8.1 gotip build .
 sylvain@tragicomix > ~/git/nats-server > main ± > ./nats-server
[1]    53704 illegal hardware instruction  ./nats-server

 ✘ sylvain@tragicomix > ~/git/nats-server > main ± > lldb ./nats-server
(lldb) target create "./nats-server"
Current executable set to '/Users/sylvain/git/nats-server/nats-server' (arm64).
(lldb) r
Process 53773 launched: '/Users/sylvain/git/nats-server/nats-server' (arm64)
Process 53773 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0xd5380600)
    frame #0: 0x00000001000744d0 nats-server`runtime.rt0_go.abi0 + 176
nats-server`runtime.rt0_go.abi0:
->  0x1000744d0 <+176>: mrs    x0, ID_AA64ISAR0_EL1
    0x1000744d4 <+180>: lsr    x0, x0, #20
    0x1000744d8 <+184>: and    x0, x0, #0xf
    0x1000744dc <+188>: cbz    x0, 0x100074528 ; <+264>
Target 0: (nats-server) stopped.

What did you expect to see?

$ ./nats-server
[52200] 2025/01/23 21:47:40.650150 [INF] Starting nats-server
[52200] 2025/01/23 21:47:40.650302 [INF]   Version:  2.11.0-dev
[52200] 2025/01/23 21:47:40.650305 [INF]   Git:      [257319c]
[52200] 2025/01/23 21:47:40.650308 [INF]   Name:     NCVSZ2X5FXDLUSYVXB7BTOHAQO4HQ4F54BSZXBR2MNVTPU5CMLNAI5ES
[52200] 2025/01/23 21:47:40.650311 [INF]   ID:       NCVSZ2X5FXDLUSYVXB7BTOHAQO4HQ4F54BSZXBR2MNVTPU5CMLNAI5ES
[52200] 2025/01/23 21:47:40.652035 [INF] Listening for client connections on 0.0.0.0:4222
[52200] 2025/01/23 21:47:40.652285 [INF] Server is ready
^C[52200] 2025/01/23 21:47:42.070510 [INF] Initiating Shutdown...
[52200] 2025/01/23 21:47:42.070923 [INF] Server Exiting..
@prattmic prattmic changed the title GOARM64=v8.1 gotip build, illegal hardware instruction runtime: GOARM64=v8.1 crash with illegal hardware instruction on v8.6 Jan 23, 2025
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jan 23, 2025
@prattmic prattmic added NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. release-blocker arch-arm64 and removed compiler/runtime Issues related to the Go compiler and/or runtime. labels Jan 23, 2025
@prattmic prattmic added this to the Go1.24 milestone Jan 23, 2025
@prattmic prattmic added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jan 23, 2025
@prattmic
Copy link
Member

prattmic commented Jan 23, 2025

The crash is on MRS ID_AA64ISAR0_EL1, R0 at https://cs.opensource.google/go/go/+/master:src/runtime/asm_arm64.s;l=89;drc=4e7025860129b33f704634583d20539af19c344b.

Added in https://go.dev/cl/610195.

I feel like I'm missing something, doesn't _EL1 mean that this register is only readable from kernel mode?

cc @golang/runtime @golang/arm

@prattmic
Copy link
Member

I do seem to be missing something because GOARM64=v8.1 is fine on a linux-arm64 gomote.

@gabyhelp gabyhelp added the BugReport Issues describing a possible bug in the Go implementation. label Jan 23, 2025
@cherrymui
Copy link
Member

I think some OS kernels emulate the read of ID_AA64ISAR0_EL1. But apparently darwin kernel doesn't. We probably should not use it on darwin.

@prattmic
Copy link
Member

IIUC, Linux emulates this instruction: https://elixir.bootlin.com/linux/v6.12.6/source/arch/arm64/kernel/cpufeature.c#L3653

This instruction is Op0 = 0x3, CRn = 0x0, Op1 = 0x0, CRm = 6.

Probably safest to only use this instruction on OSes that we are sure emulate it, rather than slowly adding a denylist as bugs come in?

@randall77
Copy link
Contributor

I think we should just revert 610195 and try again next cycle.

@prattmic
Copy link
Member

For future reference, there do seem to be sysctls on Darwin to check for CPU features: https://github.com/apple-oss-distributions/xnu/blob/8d741a5de7ff4191bf97d57b9f54c2f6d4a15585/bsd/kern/kern_mib.c#L1185

Though I'm unsure if we'd manage to successfully run long enough to be able to call into libc before crashing from a bad instruction.

@cherrymui
Copy link
Member

Yeah, we use sysctls on Darwin in the internal/cpu package. Instead of checking the CPU features in assembly code, perhaps we just check it against what we got from internal/cpu, and use only the lowest level instructions in runtime startup before the check, if that is possible.

@mauri870
Copy link
Member

Does anyone managed to look into a fix? It seems that we have to shift logic around a bit to do a proper fix since this assembly code runs before any runtime init, so I would rather revert it as Keith suggested.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/644695 mentions this issue: Revert "runtime: Check LSE support on ARM64 at runtime init"

@andreybokhanko
Copy link
Contributor

The crash is on MRS ID_AA64ISAR0_EL1, R0 at https://cs.opensource.google/go/go/+/master:src/runtime/asm_arm64.s;l=89;drc=4e7025860129b33f704634583d20539af19c344b.

Added in https://go.dev/cl/610195.

I feel like I'm missing something, doesn't _EL1 mean that this register is only readable from kernel mode?

cc @golang/runtime @golang/arm

Hi @prattmic , I'm the original author of the offending code, so perhaps it makes sense to add me to CC list of related bugs? Thanks!

@andreybokhanko
Copy link
Contributor

I think some OS kernels emulate the read of ID_AA64ISAR0_EL1. But apparently darwin kernel doesn't. We probably should not use it on darwin.

@cherrymui , there is a similar note from @4a6f656c during the original code review: https://go-review.googlesource.com/c/go/+/610195/10#message-d03c129bec9642058969f7ee21b46a178ee18c1b.

We have two options: either #ifndef not only OpenBSD, but all BSD flavors, including Darwin, as well -- or #ifdef it only for Linux. What would you prefer?

@prattmic
Copy link
Member

@andreybokhanko, apologies, I indeed should have CC'd you when I found https://go.dev/cl/610195.

We have two options: either #ifndef not only OpenBSD, but all BSD flavors, including Darwin, as well -- or #ifdef it only for Linux. What would you prefer?

If we're going to continue using assembly, I think #ifdef enabling it only for OSes that we know for sure do emulation is the way to go. That said, I agree with @cherrymui that it would be nice if we could use internal/cpu for feature checks, since that package already knows how to get the features. Though as we use more and more advanced features in the compiler, it seems like it might be hard to ensure that we don't use any unsupported features long enough to get to internal/cpu.

@cherrymui
Copy link
Member

I was thinking if possible to ensure the compiler to use just the basic ISA to compile the startup functions, either by having a hard-coded list of functions, or using some (internal) pragma. They only run during startup, so it doesn't need to be very fast, using the newest CPU features.

If we want to keep using assembly, perhaps start with #ifdef on Linux and FreeBSD (which seems to work, at least the internal/cpu package reads ID_AA64ISAR0_EL1). We also want to think about how to check other features if we make use of more in the future.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/645795 mentions this issue: runtime: Check LSE support on ARM64 at runtime init

@andreybokhanko
Copy link
Contributor

I was thinking if possible to ensure the compiler to use just the basic ISA to compile the startup functions, either by having a hard-coded list of functions, or using some (internal) pragma. They only run during startup, so it doesn't need to be very fast, using the newest CPU features.

If we want to keep using assembly, perhaps start with #ifdef on Linux and FreeBSD (which seems to work, at least the internal/cpu package reads ID_AA64ISAR0_EL1). We also want to think about how to check other features if we make use of more in the future.

TBH, I don't feel comfortable enough to change the whole approach to startup checks, so let me choose the easier way (#ifdef for Linux and FreeBSD) for the fix.

https://go-review.googlesource.com/c/go/+/645795 submitted; please take a look.

@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-arm64 BugReport Issues describing a possible bug in the Go implementation. compiler/runtime Issues related to the Go compiler and/or runtime. NeedsFix The path to resolution is known, but the work has not been done. release-blocker
Projects
None yet
Development

No branches or pull requests

9 participants