Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename ARMv8 to AArch64 #139

Open
gnzlbg opened this issue Oct 22, 2017 · 15 comments
Open

Rename ARMv8 to AArch64 #139

gnzlbg opened this issue Oct 22, 2017 · 15 comments
Labels

Comments

@gnzlbg
Copy link
Contributor

gnzlbg commented Oct 22, 2017

LLVM calls ARMv8 AArch64. Some ARM documents call it A64 or v8. For consistency with LLVM we should probably rename it to aarch64. Thoughts? @parched @japaric

@petrochenkov
Copy link

petrochenkov commented Oct 22, 2017

LLVM calls ARMv8 AArch64. Some ARM documents call it A64

<pedantry>

All the listed names mean different things.

  • ARMv8 - ARM architecture version 8
    • ARMv8-A - Architectural profile inside ARMv8
      • AArch64 - Execution state inside ARMv8-A (execution on one core can switch between different execution states, e.g. between AArch64 OS and AArch32 application, using exceptions).
        • A64 - Instruction set
      • AArch32 - Execution state inside ARMv8-A.
        • A32 - Instruction set (execution in one execution state can switch between different instruction sets using jumps).
        • T32 - Instruction set
    • ARMv8-M, ARMv8-R - ...

</pedantry>

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 22, 2017

Currently, we have the arm intrinsics in the v6, v7, and v8 modules +- neon (v7_neon, v8_neon), would maybe splitting v8 into A32 and A64 make sense?

@parched
Copy link

parched commented Oct 22, 2017

Why not just 2 modules, scalar and vector/simd/neon?

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 22, 2017

I should rephrase the question. First, the modules are private. Second, we use them only for conditional compilation, that is, for deciding which intrinsics to provide depending on the target triple.

@parched
Copy link

parched commented Oct 23, 2017

Ok understood. It's probably not possible to use the target triple for that as they don't have to follow convention. Custom targets can have any name and even some of the builtin targets don't follow a convention. e.g., all the arm-* targets have no version and armv7-linux-androideabi generates Thumb not Arm code. I guess you could special case for every builtin target but custom targets would still be an issue.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 23, 2017

AFAIK LLVM cannot emit code that contains aarch64 instructions but is able to run on v6/v7 if those instructions are not reached.

This means that run-time feature detection cannot be used to generate an, e.g., armv7 binary that uses aarch64 instructions if they are available at run-time.

I think that what I will do is move the v8 and v8_neon modules out of the arm architecture into the aarch64 architecture module to make this distinction more clear.

@japaric this doesn't fit good with the reality of the ARMv8 specification but fits better with what users can actually do with Rust right now. @parched thoughts?

@parched
Copy link

parched commented Oct 24, 2017

@gnzlbg Yes looking at the current v8 module those instructions are specifically AArch64 only so probably shouldn't be under a general v8 module. However, those functions can be generated on any architecture/version, they would just use different instructions, so maybe you could not have them under any conditional compilation.

Some other general notes having had a look now,

  • CLZ is available in Armv5 in the Arm state but not Thumb state.
  • A lot of these functions aren't SIMD related, is the goal of this crate to expose all vendor intrinsics including non-SIMD?
  • They all start with an underscore, is there a reason for that?
  • Everything is declared unsafe even when they don't need to be.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 24, 2017

However, those functions can be generated on any architecture/version, they would just use different instructions, so maybe you could not have them under any conditional compilation.

I agree but by design the functions in this crate "guarantee" that a specific instruction is generated, and should only be available on the architectures in which they are available.

A higher-level crate built on top of this one (or maybe not for those functions) could do this.

A lot of these functions aren't SIMD related, is the goal of this crate to expose all vendor intrinsics including non-SIMD?

Yes, the goal for this crate is to expose all vendor intrinsics including non-SIMD ones that we want to use in stable at some point. If you take a look at the x86 modules you will find the abm, tbm, bmi, and bmi2 sub-modules exposing many bit manipulation intrinsics. Although this crate is called stdsimd, these intrinsics will be all exported together in core::vendor. (@BurntSushi @alexcrichton maybe we should rename the crate to stdvendor for the next release).

They all start with an underscore, is there a reason for that?

I followed the name of the C headers / ARM reference. If I made a mistake on some intrinsic please fill an issue and I'll fix it. The names are not set in stone, but we should have a good idea about them before writing an RFC to stabilize the intrinsics.

Everything is declared unsafe even when they don't need to be.

The idea is to use #[target_feature = "+..."] to force the compiler to emit the intrinsic although we are lacking in this regard for some of the ARM intrinsics.

For example, consider what happens when calling _rbit_u32, which uses #[target_feature = "+v7"], on ARMv5.

Since LLVM and downwards regard this as undefined behavior, we agreed to make the API of all intrinsic unsafe in general. If some architecture always guarantee the support for an intrinsic we should consider on a case-by-case basis whether we make them safe (just fill an issue).

@parched
Copy link

parched commented Oct 25, 2017

I agree but by design the functions in this crate "guarantee" that a specific instruction is generated, and should only be available on the architectures in which they are available.

Ok sure, FWIW the Arm instrinsics do for some and not for others [ACLE]

I followed the name of the C headers / ARM reference.

Most of the Arm instrinics start with double underscore but that's only because there's no namespacing in C. I think you can drop the leading double underscore for Rust.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 25, 2017

So this is the refactor that I plan to do:

  • arm/ architecture module (available if arch is either arm or aarch64):
    • v5.rs v5 sub-module (always available)
    • v6.rs v6 sub-module (always available*)
    • v7.rs v7 sub-module (available is cfg_target_feature = "v7")
    • neon.rs neon sub-module (available is cfg_target_feature = "neon")
  • aarch64/ architecture module (available if arch is aarch64)
    • a64.rs A64 sub-module (always available)
    • neon.rs neon sub-module (available is cfg_target_feature = "neon")

[*] ideally we should switch v6 on only when cfg_target_feature = "v6" but AFAIK rustc does not implement the v6 feature and I don't know if LLVM supports this.

@parched
Copy link

parched commented Oct 25, 2017

@gnzlbg I'm still not sure doing conditional compilation like that will work. E.g., where will the CLZ intrinsic go? It's available in the Arm instruction set in Armv5 but in the Thumb instruction set from Armv6T2. How about NEON instrinsics available in Armv8.2 but not Armv8?

IMO each instrinsic should be handled on a case by case basis and just group them by category.

ideally we should switch v6 on only when cfg_target_feature = "v6" but AFAIK rustc does not implement the v6 feature and I don't know if LLVM supports this.

LLVM does have target features for every architecture/version combination but currently rust doesn't expose any of them (including v7) unless I'm mistaken.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Oct 25, 2017

@parched

I'm still not sure doing conditional compilation like that will work. E.g., where will the CLZ intrinsic go? It's available in the Arm instruction set in Armv5 but in the Thumb instruction set from Armv6T2. How about NEON instrinsics available in Armv8.2 but not Armv8?

Currently, the granularity at which I can do arch and feature detection on rustc is very limited, but supposing that we would improve that, we could do something like this:

  • arm/: architecture module (available if arch is either arm or aarch64):
    • v5/: v5 sub-module
      • clz.rs: CLZ sub-module
      • ...: others
    • v6/: v6 sub-module
      • t2.rs: T2 sub-module (available when cfg_target_feature= "t2", includes v5/clz)
      • ...: others
    • v7/: v7 sub-module (available is cfg_target_feature = "v7")
      • neon.rs: neon sub-module (available is cfg_target_feature = "neon")
      • ...: others
  • aarch64/: architecture module (available if arch is aarch64)
    • a64.rs: A64 sub-module
    • neon.rs: neon sub-module (available is cfg_target_feature = "neon")

Note that at the end of the day this is still using conditional compilation based on arch and cfg_target_feature to provide meaningful intrinsics.

The hard part of the work is identifying all these groups, and writing the lists of intrinsics that belong to each. We can already start adding intrinsics here and organizing the library in a way that makes sense. Adding the necessary feature detection to rustc can be done in parallel it is pretty trivial if the features are supported by LLVM.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Nov 22, 2017

I don't like the arm split in the top module (even though this is what rustc uses). I think it would be better to just do:

  • arm/
    • a32/
      • v5/
      • v6/
      • .../
      • neon/
    • a64/
      • v7/
      • v8/
      • .../
      • asimd/

This should scale to thumb mode (which I hope we will be able to test soon), and should keep the top-level module clean as we add mips and altivec support.

@Amanieu
Copy link
Member

Amanieu commented Feb 26, 2018

IMO we should just do the same as clang and always provide the ror, rev*, clz, and other basic intrinsics. These will fall back to software implementations anyways if they aren't supported natively.

@gnzlbg
Copy link
Contributor Author

gnzlbg commented Feb 27, 2018

@Amanieu I was just going to ping you here.

The whole point of the intrinsics in stdsimd is that one is guaranteed to get the assembly they specify. It is then up to the user to make sure that they can call them at run-time, or to provide a software fallback if they want portability. So on archs that do not support some intrinsics, I'd prefer to just not expose them.

Exposing portable functionality is useful though, and the job of other std components, so I could imagine a portable reverse_bits intrinsic being added to core::intrinsics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants