-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
all: add GOARM64=v8.1 and so on #60905
Comments
The currently valid values for |
It would be surprising to have |
Good point, thanks, it should be That may mean that we don't need to be compatible with |
The leading v seems like it emphasizes that this is the ARMv8.1 v numbers not any other kind of ARM numbering (like ARM11 which implements ARMv6). It also makes sure that nothing tries to strconv.Atoi("8") and then get messed up by strconv.Atoi("8.1"). So I think we should keep the leading v. |
This proposal has been added to the active column of the proposals project |
I would like to sound extra support for retaining v and emphasize that it's not a number -- a number is not enough to represent Arm capabilities. Currently Arm defines Armv8.0~Armv8.9 and Armv9.0~Armv9.3, but Armv9.0 is not the next version of Armv8.9, it actually branched off Armv8.5. Parsing the version into number will create an illusion of linear progression. Further, each of these versions are actually "extensions." Each extension defines a number of "features," e.g. Because of this, both gcc and clang have complicated ways to precisely specify what Arm version to target: https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#index-march-2, https://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html In this system, the compiler cannot assume the target machine has LSE instructions if we only compile with My suggestion is we also implement similar notations in |
Is there a list of all the ARM64 versions somewhere, so we know what to accept? As far as adding other modifiers, the other architectures have set the precedent of using comma-separated lists, so it would be GOARM64=v8.1,lse not v8.1+lse. |
I believe the canonical source is the ARM Architecture Reference Manual. Chapter A2 describes the versions. It is not at all consise; a summary table would be nice, but it does describe the requirements if you dive into the details. e.g., "FEAT_SB, Speculation Barrier ... This feature is OPTIONAL in Armv8.0 implementations and mandatory in Armv8.5 implementations." The full list of features is very long. I'm personally a bit skeptical about having GOARM64 accept the full set when we'll likely only adjust compilation based on a few of them. |
Sorry, I wasn't asking for the full set of features, only for the full set of version numbers, which I hope is much smaller. |
We need the full set so we know which (older) build tags to enable when we have GOARM64=v8.3 for example. |
Those are easier! From the same chapter, I believe the full set of versions is v8.0 through v8.9 and v9 [1] through v9.3. [1] The document uses the name v8.0 for the first v8 version, but v9 (not v9.0) for the first v9 version. I'm not sure if that is an intentional difference or not. |
It is also worth noting that v8 and v9 overlap.
(I don't know why there is no v9.4 for v8.9...) Edit: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 mentions a v9.4 to accompany v8.9, but it seems to be missing from the manual. i.e., the v9 build should not include the v8.6 build tag. |
Based on the discussion above, this proposal seems like a likely accept. |
No change in consensus, so accepted. 🎉 |
I have first version of a patch adding support for GOARM64 + unconditional atomics for ARM8.1/LSE ready. Unless someone else is also happen to work on this at the moment and ready to publish his/her changes (@mauri870 ?), let me get stub on this and publish my patch early next week (I still need to comb it before public code review; also, I will split the patch to two: one just adding GOARM64 and another that actually uses GOARM64 for atomics). @mknyszek , please remove "help wanted" label. For reference, "go test -bench" expectedly shows huge gains for
but more importantly, this patch shows repeatable +4.6% performance gain for a real-world application (etcd) on my ARM64 server (Kunpeng920 chip). |
Thanks for your efforts, @andreybokhanko! I've only begun collecting some notes and haven't started working on an implementation. I'll be happy to review yours! |
Change https://go.dev/cl/559555 mentions this issue: |
Note that current code review discussion leans towards dropping support for ",lse" and ",crypto", as nobody gave an example of real ARM hardware that implements LSE and not fully supports 8.1 -- as well as real ARM hardware that implements crypto and not fully supports 8.2. |
Adds GOARM64 environment variable with accepted range of values "v8.{0-9}", "v9.{0-5}" and optional ",lse" and ",crypto" suffixes. Right now it doesn't affect anything, but can be used in the future to selectively target specific versions of different ARM64 hardware. For #60905 Change-Id: I6d530041b6931aa884e34f719f8ec41b1cb03ece Reviewed-on: https://go-review.googlesource.com/c/go/+/559555 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Mauri de Souza Meneguzzo <[email protected]> Reviewed-by: Cherry Mui <[email protected]> Reviewed-by: Shu-Chun Weng <[email protected]> Reviewed-by: Fannie Zhang <[email protected]>
A fix is landed to mainline (https://go-review.googlesource.com/c/go/+/559555), so this can be closed. |
Assuming this issue only tracked the addition of the GOARM64 variable, it looks like it can be closed. |
Change https://go.dev/cl/591875 mentions this issue: |
For #65614. Updates #60905. Change-Id: I2dd9df3c7066357cf06268d918bad3c255b38aed Reviewed-on: https://go-review.googlesource.com/c/go/+/591875 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Dmitri Shuralyov <[email protected]> Reviewed-by: Joel Sing <[email protected]>
Change https://go.dev/cl/610195 mentions this issue: |
Check presence of LSE support on ARM64 chip if we targeted it at compile time. Related to #69124 Update #60905 Change-Id: I6fe244decbb4982548982e1f88376847721a33c7 Reviewed-on: https://go-review.googlesource.com/c/go/+/610195 Reviewed-by: Cherry Mui <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Shu-Chun Weng <[email protected]>
Change https://go.dev/cl/644695 mentions this issue: |
This reverts CL 610195. Reason for revert: SIGILL on macOS. See issue #71411. Updates #69124, #60905. Fixes #71411. Change-Id: Ie0624e516dfb32fb13563327bcd7f557e5cba940 Reviewed-on: https://go-review.googlesource.com/c/go/+/644695 LUCI-TryBot-Result: Go LUCI <[email protected]> Reviewed-by: Keith Randall <[email protected]> Reviewed-by: Keith Randall <[email protected]> Reviewed-by: Mauri de Souza Meneguzzo <[email protected]>
Change https://go.dev/cl/645795 mentions this issue: |
Similar to: #45453.
Each ARM sub-architecture iteration adds some new instructions, some of which may be useful for Go. For example, ARMv8.2 introduced:
LDADD
(atomic add), useful foratomic.Add*
.LDSETAL
(atomic or), see sync/atomic: add OR/AND operators for unsigned types #61395.CAS
(compare and swap), useful foratomic.CompareAndSwap*
SWP
(swap), useful foratomic.Swap*
These instructions are already used after a capability check (thanks @prattmic for pointing this out). These capability check jumps predict perfectly, but the blocks aren't tiny either (godbolt):
It might be useful to define some new possible
GOARM
values akin toGOAMD
to elide these checks, potentially increasing performance and reducing binary sizes by some tiny amount. I have not done any performance measurements, but received hearsay from colleagues that using these instructions was a significant win for them (but I did not ask whether they used a capability check or whether they were comparing with always using the less capable version).Other potential candidate:
runtime.memmove
is >1% of cycles when looking at data available here. UsingFEAT_MOPS
unconditionally could also have good icache effects.The text was updated successfully, but these errors were encountered: