Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

all: add GOARM64=v8.1 and so on #60905

Closed
aktau opened this issue Jun 20, 2023 · 30 comments
Closed

all: add GOARM64=v8.1 and so on #60905

aktau opened this issue Jun 20, 2023 · 30 comments
Labels
arch-arm64 compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Accepted
Milestone

Comments

@aktau
Copy link
Contributor

aktau commented Jun 20, 2023

Similar to: #45453.

Each ARM sub-architecture iteration adds some new instructions, some of which may be useful for Go. For example, ARMv8.2 introduced:

  • v8.2: FEAT_LSE, Large System Extensions (though it looks like there have been changes to this extension in later iterations)

These instructions are already used after a capability check (thanks @prattmic for pointing this out). These capability check jumps predict perfectly, but the blocks aren't tiny either (godbolt):

image

It might be useful to define some new possible GOARM values akin to GOAMD to elide these checks, potentially increasing performance and reducing binary sizes by some tiny amount. I have not done any performance measurements, but received hearsay from colleagues that using these instructions was a significant win for them (but I did not ask whether they used a capability check or whether they were comparing with always using the less capable version).

Other potential candidate:

  • v8.1: FEAT_CRC32, Changes to CRC32 instructions (made mandatory)
  • v8.8: FEAT_MOPS, Standardization of memory operations, slides). runtime.memmove is >1% of cycles when looking at data available here. Using FEAT_MOPS unconditionally could also have good icache effects.
  • v8.8: FEAT_HBC, Hinted conditional branches (perhaps indented branches or anything that can be proven to contain a non-nil error via dataflow could be hinted "unlikely").
@aktau aktau added the Proposal label Jun 20, 2023
@prattmic prattmic added the compiler/runtime Issues related to the Go compiler and/or runtime. label Jun 20, 2023
@aktau aktau changed the title all: add GOARM selectors for ARMv8.2 all: add GOARM selectors for ARMv8.1 (et al.) Jun 20, 2023
@ianlancetaylor ianlancetaylor changed the title all: add GOARM selectors for ARMv8.1 (et al.) proposal: all: add GOARM selectors for ARMv8.1 (et al.) Jun 20, 2023
@ianlancetaylor
Copy link
Member

The currently valid values for GOARM are 5, 6, and 7. Are you suggesting that we add 8 and 8.1 as valid values?

@ianlancetaylor ianlancetaylor moved this to Incoming in Proposals Jun 20, 2023
@gopherbot gopherbot added this to the Proposal milestone Jun 20, 2023
@Jorropo
Copy link
Member

Jorropo commented Jun 21, 2023

It would be surprising to have GOARM used both by GOARCH=arm and GOARCH=arm64 imo should be a GOARM64.

@ianlancetaylor
Copy link
Member

Good point, thanks, it should be GOARM64.

That may mean that we don't need to be compatible with GOARM, and could use v8, v8.1, ....

@rsc rsc changed the title proposal: all: add GOARM selectors for ARMv8.1 (et al.) proposal: all: add GOARM64=v8.1 and so on Jun 21, 2023
@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

The leading v seems like it emphasizes that this is the ARMv8.1 v numbers not any other kind of ARM numbering (like ARM11 which implements ARMv6). It also makes sure that nothing tries to strconv.Atoi("8") and then get messed up by strconv.Atoi("8.1"). So I think we should keep the leading v.

@rsc
Copy link
Contributor

rsc commented Jun 21, 2023

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@rsc rsc moved this from Incoming to Active in Proposals Jun 21, 2023
@scweng
Copy link

scweng commented Jun 22, 2023

I would like to sound extra support for retaining v and emphasize that it's not a number -- a number is not enough to represent Arm capabilities.

Currently Arm defines Armv8.0~Armv8.9 and Armv9.0~Armv9.3, but Armv9.0 is not the next version of Armv8.9, it actually branched off Armv8.5. Parsing the version into number will create an illusion of linear progression.

Further, each of these versions are actually "extensions." Each extension defines a number of "features," e.g. FEAT_LSE, with some being mandatory and some optional. Some, like LSE, are optional up to a version (8.0) and then become mandatory later (8.1).

Because of this, both gcc and clang have complicated ways to precisely specify what Arm version to target: https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html#index-march-2, https://lists.llvm.org/pipermail/llvm-dev/2018-September/126346.html

In this system, the compiler cannot assume the target machine has LSE instructions if we only compile with -march=armv8-a and thus needs to emit capability tests. But -march=armv8-a+lse means targeting a machine that implements all the mandatory features defined in the base v8 ISA, but also assume the machine has LSE, which would be enough for the compiler to skip the capability test and use instructions like LDADD directly.

My suggestion is we also implement similar notations in GOARM64 so that GOARM64=v8.1-a and GOARM64=v8-a+lse both suffice in using LSE instructions directly.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

Is there a list of all the ARM64 versions somewhere, so we know what to accept?

As far as adding other modifiers, the other architectures have set the precedent of using comma-separated lists, so it would be GOARM64=v8.1,lse not v8.1+lse.

@prattmic
Copy link
Member

I believe the canonical source is the ARM Architecture Reference Manual. Chapter A2 describes the versions. It is not at all consise; a summary table would be nice, but it does describe the requirements if you dive into the details. e.g., "FEAT_SB, Speculation Barrier ... This feature is OPTIONAL in Armv8.0 implementations and mandatory in Armv8.5 implementations."

The full list of features is very long. I'm personally a bit skeptical about having GOARM64 accept the full set when we'll likely only adjust compilation based on a few of them.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

Sorry, I wasn't asking for the full set of features, only for the full set of version numbers, which I hope is much smaller.

@rsc
Copy link
Contributor

rsc commented Jun 28, 2023

We need the full set so we know which (older) build tags to enable when we have GOARM64=v8.3 for example.

@prattmic
Copy link
Member

Those are easier! From the same chapter, I believe the full set of versions is v8.0 through v8.9 and v9 [1] through v9.3.

[1] The document uses the name v8.0 for the first v8 version, but v9 (not v9.0) for the first v9 version. I'm not sure if that is an intentional difference or not.

@prattmic
Copy link
Member

prattmic commented Jun 28, 2023

It is also worth noting that v8 and v9 overlap.

  • v9 implements v8.5
  • v9.1 implements v8.6
  • v9.2 implements v8.7
  • v9.3 implements v8.8

(I don't know why there is no v9.4 for v8.9...) Edit: https://community.arm.com/arm-community-blogs/b/architectures-and-processors-blog/posts/arm-a-profile-architecture-2022 mentions a v9.4 to accompany v8.9, but it seems to be missing from the manual.

i.e., the v9 build should not include the v8.6 build tag.

@rsc
Copy link
Contributor

rsc commented Jul 5, 2023

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc rsc moved this from Active to Likely Accept in Proposals Jul 5, 2023
@rsc
Copy link
Contributor

rsc commented Jul 12, 2023

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc moved this from Likely Accept to Accepted in Proposals Jul 12, 2023
@rsc rsc changed the title proposal: all: add GOARM64=v8.1 and so on all: add GOARM64=v8.1 and so on Jul 12, 2023
@rsc rsc modified the milestones: Proposal, Backlog Jul 12, 2023
@andreybokhanko
Copy link
Contributor

I have first version of a patch adding support for GOARM64 + unconditional atomics for ARM8.1/LSE ready. Unless someone else is also happen to work on this at the moment and ready to publish his/her changes (@mauri870 ?), let me get stub on this and publish my patch early next week (I still need to comb it before public code review; also, I will split the patch to two: one just adding GOARM64 and another that actually uses GOARM64 for atomics). @mknyszek , please remove "help wanted" label.

For reference, "go test -bench" expectedly shows huge gains for runtime/internal/atomic package:

goos: linux
goarch: arm64
pkg: runtime/internal/atomic
                 │    old.txt    │                 new.txt                 │
                 │    sec/op     │    sec/op      vs base                  │
AtomicLoad64-96    0.4814n ± ∞ ¹   0.7701n ± ∞ ¹         ~ (p=1.000 n=1) ²
AtomicStore64-96    3.851n ± ∞ ¹    3.851n ± ∞ ¹         ~ (p=1.000 n=1) ³
AtomicLoad-96      0.7698n ± ∞ ¹   0.4812n ± ∞ ¹         ~ (p=1.000 n=1) ²
AtomicStore-96      3.851n ± ∞ ¹    3.850n ± ∞ ¹         ~ (p=1.000 n=1) ²
And8-96            10.010n ± ∞ ¹    7.314n ± ∞ ¹         ~ (p=1.000 n=1) ²
And-96             10.010n ± ∞ ¹    7.313n ± ∞ ¹         ~ (p=1.000 n=1) ²
And8Parallel-96     47.28n ± ∞ ¹   308.70n ± ∞ ¹         ~ (p=1.000 n=1) ²
AndParallel-96      44.21n ± ∞ ¹   246.80n ± ∞ ¹         ~ (p=1.000 n=1) ²
Or8-96             10.010n ± ∞ ¹    7.315n ± ∞ ¹         ~ (p=1.000 n=1) ²
Or-96              10.010n ± ∞ ¹    7.319n ± ∞ ¹         ~ (p=1.000 n=1) ²
Or8Parallel-96      42.97n ± ∞ ¹   247.20n ± ∞ ¹         ~ (p=1.000 n=1) ²
OrParallel-96       43.32n ± ∞ ¹   248.20n ± ∞ ¹         ~ (p=1.000 n=1) ²
Xadd-96             42.79n ± ∞ ¹   243.20n ± ∞ ¹         ~ (p=1.000 n=1) ²
Xadd64-96           42.45n ± ∞ ¹   242.40n ± ∞ ¹         ~ (p=1.000 n=1) ²
Cas-96              85.03n ± ∞ ¹    93.75n ± ∞ ¹         ~ (p=1.000 n=1) ²
Cas64-96            84.99n ± ∞ ¹    91.80n ± ∞ ¹         ~ (p=1.000 n=1) ²
Xchg-96             42.45n ± ∞ ¹   242.50n ± ∞ ¹         ~ (p=1.000 n=1) ²
Xchg64-96           42.38n ± ∞ ¹   243.30n ± ∞ ¹         ~ (p=1.000 n=1) ²
geomean             16.06n          33.04n        +105.69%
¹ need >= 6 samples for confidence interval at level 0.95
² need >= 4 samples to detect a difference at alpha level 0.05
³ all samples are equal

but more importantly, this patch shows repeatable +4.6% performance gain for a real-world application (etcd) on my ARM64 server (Kunpeng920 chip).

@mauri870
Copy link
Member

Thanks for your efforts, @andreybokhanko! I've only begun collecting some notes and haven't started working on an implementation. I'll be happy to review yours!

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/559555 mentions this issue: cmd/dist,internal: add ARM64 environment variable

@andreybokhanko
Copy link
Contributor

https://go-review.googlesource.com/c/go/+/559555 submitted.

@rsc @mauri870 Please kindly review.

@andreybokhanko
Copy link
Contributor

andreybokhanko commented Feb 8, 2024

Change https://go.dev/cl/559555 mentions this issue: cmd/dist,internal: add ARM64 environment variable

Note that current code review discussion leans towards dropping support for ",lse" and ",crypto", as nobody gave an example of real ARM hardware that implements LSE and not fully supports 8.1 -- as well as real ARM hardware that implements crypto and not fully supports 8.2.

gopherbot pushed a commit that referenced this issue Mar 6, 2024
Adds GOARM64 environment variable with accepted range of values "v8.{0-9}",
"v9.{0-5}" and optional ",lse" and ",crypto" suffixes.

Right now it doesn't affect anything, but can be used in the future to
selectively target specific versions of different ARM64 hardware.

For #60905

Change-Id: I6d530041b6931aa884e34f719f8ec41b1cb03ece
Reviewed-on: https://go-review.googlesource.com/c/go/+/559555
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Mauri de Souza Meneguzzo <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
Reviewed-by: Shu-Chun Weng <[email protected]>
Reviewed-by: Fannie Zhang <[email protected]>
@andreybokhanko
Copy link
Contributor

A fix is landed to mainline (https://go-review.googlesource.com/c/go/+/559555), so this can be closed.

@ALTree
Copy link
Member

ALTree commented Mar 6, 2024

Assuming this issue only tracked the addition of the GOARM64 variable, it looks like it can be closed.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/591875 mentions this issue: doc/next: document GOARM64 environment variable

gopherbot pushed a commit that referenced this issue Jun 11, 2024
For #65614.
Updates #60905.

Change-Id: I2dd9df3c7066357cf06268d918bad3c255b38aed
Reviewed-on: https://go-review.googlesource.com/c/go/+/591875
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Dmitri Shuralyov <[email protected]>
Reviewed-by: Joel Sing <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/610195 mentions this issue: runtime: Check ARM64 architecture at runtime

gopherbot pushed a commit that referenced this issue Oct 22, 2024
Check presence of LSE support on ARM64 chip if we targeted it at compile time.

Related to #69124
Update #60905

Change-Id: I6fe244decbb4982548982e1f88376847721a33c7
Reviewed-on: https://go-review.googlesource.com/c/go/+/610195
Reviewed-by: Cherry Mui <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Shu-Chun Weng <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/644695 mentions this issue: Revert "runtime: Check LSE support on ARM64 at runtime init"

gopherbot pushed a commit that referenced this issue Jan 27, 2025
This reverts CL 610195.

Reason for revert: SIGILL on macOS. See issue #71411.

Updates #69124, #60905.
Fixes #71411.

Change-Id: Ie0624e516dfb32fb13563327bcd7f557e5cba940
Reviewed-on: https://go-review.googlesource.com/c/go/+/644695
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
Reviewed-by: Keith Randall <[email protected]>
Reviewed-by: Mauri de Souza Meneguzzo <[email protected]>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/645795 mentions this issue: runtime: Check LSE support on ARM64 at runtime init

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arch-arm64 compiler/runtime Issues related to the Go compiler and/or runtime. Proposal Proposal-Accepted
Projects
Status: Accepted
Development

No branches or pull requests