-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AArch64] Split FeatureAES to FEAT_AES and FEAT_PMULL. #110816
Conversation
Currently in LLVM FeatureAES models both FEAT_AES and FEAT_PMULL lumped together. Similarly FeatureSVE2AES means FEAT_SVE_AES plus FEAT_SVE_PMULL128. However the architecture does not mandate that both need to be implemented at the same time. Splitting them will allow Function Multiversioning to enable backend support for 'aes' and 'sve2-aes'. I have added an override for the user visible names of the new features to preserve the old semantics for backwards compatibility with command line, target attribute and assembler directives.
Adding the GCC folks @andrewcarlotti and @Wilco1 for visibility. |
@llvm/pr-subscribers-clang @llvm/pr-subscribers-backend-aarch64 Author: Alexandros Lamprineas (labrinea) ChangesCurrently in LLVM FeatureAES models both FEAT_AES and FEAT_PMULL lumped together. Similarly FeatureSVE2AES means FEAT_SVE_AES plus FEAT_SVE_PMULL128. However the architecture does not mandate that both need to be implemented at the same time. Splitting them will allow Function Multiversioning to enable backend support for 'aes' and 'sve2-aes'. I have added an override for the user visible names of the new features to preserve the old semantics for backwards compatibility with command line, target attribute and assembler directives. Patch is 203.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/110816.diff 89 Files Affected:
diff --git a/clang/lib/Basic/Targets/AArch64.cpp b/clang/lib/Basic/Targets/AArch64.cpp
index 5f5dfcb722f9d4..9b0f744ac543bc 100644
--- a/clang/lib/Basic/Targets/AArch64.cpp
+++ b/clang/lib/Basic/Targets/AArch64.cpp
@@ -853,7 +853,7 @@ bool AArch64TargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
HasSVE2 = true;
HasSVE2p1 = true;
}
- if (Feature == "+sve2-aes") {
+ if (Feature == "+sve2-aes" || Feature == "+sve2-pmull128") {
FPU |= NeonMode;
FPU |= SveMode;
HasFullFP16 = true;
@@ -963,7 +963,7 @@ bool AArch64TargetInfo::handleTargetFeatures(std::vector<std::string> &Features,
HasCRC = true;
if (Feature == "+rcpc")
HasRCPC = true;
- if (Feature == "+aes") {
+ if (Feature == "+aes" || Feature == "+pmull") {
FPU |= NeonMode;
HasAES = true;
}
diff --git a/clang/test/CodeGen/aarch64-fmv-dependencies.c b/clang/test/CodeGen/aarch64-fmv-dependencies.c
index 681f7e82634fa8..12d7ed34eaeaa7 100644
--- a/clang/test/CodeGen/aarch64-fmv-dependencies.c
+++ b/clang/test/CodeGen/aarch64-fmv-dependencies.c
@@ -3,7 +3,7 @@
// RUN: %clang --target=aarch64-linux-gnu --rtlib=compiler-rt -emit-llvm -S -o - %s | FileCheck %s
-// CHECK: define dso_local i32 @fmv._Maes() #[[ATTR0:[0-9]+]] {
+// CHECK: define dso_local i32 @fmv._Maes() #[[aes:[0-9]+]] {
__attribute__((target_version("aes"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Mbf16() #[[bf16_ebf16:[0-9]+]] {
@@ -156,13 +156,13 @@ __attribute__((target_version("sve-i8mm"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Msve2() #[[sve2:[0-9]+]] {
__attribute__((target_version("sve2"))) int fmv(void) { return 0; }
-// CHECK: define dso_local i32 @fmv._Msve2-aes() #[[sve2_aes_sve2_pmull128:[0-9]+]] {
+// CHECK: define dso_local i32 @fmv._Msve2-aes() #[[sve2_aes:[0-9]+]] {
__attribute__((target_version("sve2-aes"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Msve2-bitperm() #[[sve2_bitperm:[0-9]+]] {
__attribute__((target_version("sve2-bitperm"))) int fmv(void) { return 0; }
-// CHECK: define dso_local i32 @fmv._Msve2-pmull128() #[[sve2_aes_sve2_pmull128:[0-9]+]] {
+// CHECK: define dso_local i32 @fmv._Msve2-pmull128() #[[sve2_pmull128:[0-9]+]] {
__attribute__((target_version("sve2-pmull128"))) int fmv(void) { return 0; }
// CHECK: define dso_local i32 @fmv._Msve2-sha3() #[[sve2_sha3:[0-9]+]] {
@@ -183,7 +183,7 @@ int caller() {
return fmv();
}
-// CHECK: attributes #[[ATTR0]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[bf16_ebf16]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[bti]] = { {{.*}} "target-features"="+bti,+fp-armv8,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[crc]] = { {{.*}} "target-features"="+crc,+fp-armv8,+neon,+outline-atomics,+v8a"
@@ -205,7 +205,7 @@ int caller() {
// CHECK: attributes #[[lse]] = { {{.*}} "target-features"="+fp-armv8,+lse,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[memtag2]] = { {{.*}} "target-features"="+fp-armv8,+mte,+neon,+outline-atomics,+v8a"
// CHECK: attributes #[[mops]] = { {{.*}} "target-features"="+fp-armv8,+mops,+neon,+outline-atomics,+v8a"
-// CHECK: attributes #[[pmull]] = { {{.*}} "target-features"="+aes,+fp-armv8,+neon,+outline-atomics,+v8a"
+// CHECK: attributes #[[pmull]] = { {{.*}} "target-features"="+aes,+fp-armv8,+neon,+outline-atomics,+pmull,+v8a"
// CHECK: attributes #[[predres]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+predres,+v8a"
// CHECK: attributes #[[rcpc]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rcpc,+v8a"
// CHECK: attributes #[[rcpc3]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+rcpc,+rcpc3,+v8a"
@@ -224,8 +224,9 @@ int caller() {
// CHECK: attributes #[[sve_bf16_ebf16]] = { {{.*}} "target-features"="+bf16,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve_i8mm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+i8mm,+neon,+outline-atomics,+sve,+v8a"
// CHECK: attributes #[[sve2]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+v8a"
-// CHECK: attributes #[[sve2_aes_sve2_pmull128]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-aes,+v8a"
+// CHECK: attributes #[[sve2_aes]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-aes,+v8a"
// CHECK: attributes #[[sve2_bitperm]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-bitperm,+v8a"
+// CHECK: attributes #[[sve2_pmull128]] = { {{.*}} "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+outline-atomics,+pmull,+sve,+sve2,+sve2-aes,+sve2-pmull128,+v8a"
// CHECK: attributes #[[sve2_sha3]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-sha3,+v8a"
// CHECK: attributes #[[sve2_sm4]] = { {{.*}} "target-features"="+fp-armv8,+fullfp16,+neon,+outline-atomics,+sve,+sve2,+sve2-sm4,+v8a"
// CHECK: attributes #[[wfxt]] = { {{.*}} "target-features"="+fp-armv8,+neon,+outline-atomics,+v8a,+wfxt"
diff --git a/clang/test/CodeGen/aarch64-targetattr.c b/clang/test/CodeGen/aarch64-targetattr.c
index 1bc78a6e1f8c0f..ce77c6145156b0 100644
--- a/clang/test/CodeGen/aarch64-targetattr.c
+++ b/clang/test/CodeGen/aarch64-targetattr.c
@@ -208,17 +208,17 @@ void applem4() {}
// CHECK: attributes #[[ATTR5]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR6]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+ete,+fp-armv8,+neon,+trbe,+v8a" }
// CHECK: attributes #[[ATTR7]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "tune-cpu"="generic" }
-// CHECK: attributes #[[ATTR8]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+crc,+dotprod,+fp-armv8,+fullfp16,+lse,+neon,+perfmon,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs,+v8.1a,+v8.2a,+v8a" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #[[ATTR8]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+crc,+dotprod,+fp-armv8,+fullfp16,+lse,+neon,+perfmon,+pmull,+ras,+rcpc,+rdm,+sha2,+spe,+ssbs,+v8.1a,+v8.2a,+v8a" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR9]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" "tune-cpu"="cortex-a710" }
-// CHECK: attributes #[[ATTR10]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+ccdp,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a" }
-// CHECK: attributes #[[ATTR11]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+ccdp,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a,-sve" }
+// CHECK: attributes #[[ATTR10]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+ccdp,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+pmull,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a" }
+// CHECK: attributes #[[ATTR11]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-v1" "target-features"="+aes,+bf16,+ccdp,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+pmull,+rand,+ras,+rcpc,+rdm,+sha2,+sha3,+sm4,+spe,+ssbs,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8a,-sve" }
// CHECK: attributes #[[ATTR12]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+sve" }
// CHECK: attributes #[[ATTR13]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16" }
-// CHECK: attributes #[[ATTR14]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
-// CHECK: attributes #[[ATTR15]] = { noinline nounwind optnone "branch-target-enforcement" "guarded-control-stack" "no-trapping-math"="true" "sign-return-address"="non-leaf" "sign-return-address-key"="a_key" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #[[ATTR14]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+pmull,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
+// CHECK: attributes #[[ATTR15]] = { noinline nounwind optnone "branch-target-enforcement" "guarded-control-stack" "no-trapping-math"="true" "sign-return-address"="non-leaf" "sign-return-address-key"="a_key" "stack-protector-buffer-size"="8" "target-cpu"="neoverse-n1" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+pmull,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+spe,+ssbs,+sve,+sve2,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8a" "tune-cpu"="cortex-a710" }
// CHECK: attributes #[[ATTR16]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" }
// CHECK: attributes #[[ATTR17]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-v9.3a" }
-// CHECK: attributes #[[ATTR18]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="apple-m4" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fpac,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+sha3,+sme,+sme-f64f64,+sme-i16i64,+sme2,+ssbs,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8.7a,+v8a,+wfxt" }
+// CHECK: attributes #[[ATTR18]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="apple-m4" "target-features"="+aes,+bf16,+bti,+ccidx,+complxnum,+crc,+dit,+dotprod,+flagm,+fp-armv8,+fp16fml,+fpac,+fullfp16,+i8mm,+jsconv,+lse,+neon,+pauth,+perfmon,+pmull,+predres,+ras,+rcpc,+rdm,+sb,+sha2,+sha3,+sme,+sme-f64f64,+sme-i16i64,+sme2,+ssbs,+v8.1a,+v8.2a,+v8.3a,+v8.4a,+v8.5a,+v8.6a,+v8.7a,+v8a,+wfxt" }
//.
// CHECK: [[META0:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
// CHECK: [[META1:![0-9]+]] = !{!"{{.*}}clang version {{.*}}"}
diff --git a/clang/test/CodeGen/arm64_crypto.c b/clang/test/CodeGen/arm64_crypto.c
index da6597be85bc80..5e6d59c490294f 100644
--- a/clang/test/CodeGen/arm64_crypto.c
+++ b/clang/test/CodeGen/arm64_crypto.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple arm64-apple-ios7.0 -target-feature +neon -target-feature +aes -target-feature +sha2 -ffreestanding -Os -S -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple arm64-apple-ios7.0 -target-feature +neon -target-feature +aes -target-feature +pmull -target-feature +sha2 -ffreestanding -Os -S -o - %s | FileCheck %s
// REQUIRES: aarch64-registered-target
diff --git a/clang/test/CodeGen/attr-target-clones-aarch64.c b/clang/test/CodeGen/attr-target-clones-aarch64.c
index 274e05de594b8e..ba3ffd5749ea68 100644
--- a/clang/test/CodeGen/attr-target-clones-aarch64.c
+++ b/clang/test/CodeGen/attr-target-clones-aarch64.c
@@ -824,7 +824,7 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-MTE-BTI-NEXT: ret ptr @ftc_inline3.default
//
//.
-// CHECK: attributes #[[ATTR0:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+lse,+neon" }
+// CHECK: attributes #[[ATTR0:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+aes,+fp-armv8,+lse,+neon" }
// CHECK: attributes #[[ATTR1:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve,+sve2" }
// CHECK: attributes #[[ATTR2:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+sha2" }
// CHECK: attributes #[[ATTR3:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+mte,+neon,+sha2" }
@@ -837,13 +837,13 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK: attributes #[[ATTR10:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+complxnum,+fp-armv8,+fullfp16,+neon,+sve,+sve2,+sve2-bitperm" }
// CHECK: attributes #[[ATTR11:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+rand" }
// CHECK: attributes #[[ATTR12:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+predres,+rcpc" }
-// CHECK: attributes #[[ATTR13:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sve,+sve2,+sve2-aes,+wfxt" }
+// CHECK: attributes #[[ATTR13:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+aes,+fp-armv8,+fullfp16,+neon,+sve,+sve2,+sve2-aes,+wfxt" }
// CHECK: attributes #[[ATTR14:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+fullfp16,+neon,+sb,+sve" }
//.
// CHECK-NOFMV: attributes #[[ATTR0:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-fmv" }
// CHECK-NOFMV: attributes #[[ATTR1:[0-9]+]] = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="-fmv" }
//.
-// CHECK-MTE-BTI: attributes #[[ATTR0:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+fp-armv8,+lse,+mte,+neon" }
+// CHECK-MTE-BTI: attributes #[[ATTR0:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+aes,+bti,+fp-armv8,+lse,+mte,+neon" }
// CHECK-MTE-BTI: attributes #[[ATTR1:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+fp-armv8,+fullfp16,+mte,+neon,+sve,+sve2" }
// CHECK-MTE-BTI: attributes #[[ATTR2:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+fp-armv8,+mte,+neon,+sha2" }
// CHECK-MTE-BTI: attributes #[[ATTR3:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+fp-armv8,+mte,+neon" }
@@ -853,7 +853,7 @@ inline int __attribute__((target_clones("fp16", "sve2-bitperm+fcma", "default"))
// CHECK-MTE-BTI: attributes #[[ATTR7:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+complxnum,+fp-armv8,+fullfp16,+mte,+neon,+sve,+sve2,+sve2-bitperm" }
// CHECK-MTE-BTI: attributes #[[ATTR8:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+fp-armv8,+mte,+neon,+rand" }
// CHECK-MTE-BTI: attributes #[[ATTR9:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+mte,+predres,+rcpc" }
-// CHECK-MTE-BTI: attributes #[[ATTR10:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+fp-armv8,+fullfp16,+mte,+neon,+sve,+sve2,+sve2-aes,+wfxt" }
+// CHECK-MTE-BTI: attributes #[[ATTR10:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+aes,+bti,+fp-armv8,+fullfp16,+mte,+neon,+sve,+sve2,+sve2-aes,+wfxt" }
// CHECK-MTE-BTI: attributes #[[ATTR11:[0-9]+]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+bti,+fp-armv8,+fullfp16,+mte,+neon,+sb,+sve" }
//.
// CHECK: [[META0:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
diff --git a/clang/test/CodeGen/attr-target-version.c b/clang/test/CodeGen/attr-target-version.c
index 228435a0494c3e..f3314bb5c32173 100644
--- a/clang/test/CodeGen/attr-target-version.c
+++ b/clang/test/CodeGen/attr-target-version.c
@@ -242,14 +242,14 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_two._Mfp
-// CHECK-SAME: () #[[ATTR5]] {
+// CHECK-SAME: () #[[ATTR12:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 1
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_two._Msimd
-// CHECK-SAME: () #[[ATTR5]] {
+// CHECK-SAME: () #[[ATTR12]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 2
//
@@ -263,7 +263,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@fmv_two._Mfp16Msimd
-// CHECK-SAME: () #[[ATTR12:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR13:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 4
//
@@ -354,14 +354,14 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_forward_default_decl._Mmops
-// CHECK-SAME: () #[[ATTR14:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR15:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_implicit_extern_forward_default_decl._Mdotprod
-// CHECK-SAME: () #[[ATTR15:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR16:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
@@ -375,7 +375,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_default_def._Msve
-// CHECK-SAME: () #[[ATTR16:[0-9]+]] {
+// CHECK-SAME: () #[[ATTR17:[0-9]+]] {
// CHECK-NEXT: entry:
// CHECK-NEXT: ret i32 0
//
@@ -389,7 +389,7 @@ int caller(void) { return used_def_without_default_decl() + used_decl_without_de
//
// CHECK: Function Attrs: noinline nounwind optnone
// CHECK-LABEL: define {{[^@]+}}@unused_with_implicit_default_def._Mfp16
-// CHECK-SAME: () #[[ATTR12]] {
+// CHECK-SAM...
[truncated]
|
// CHECK: attributes #[[ATTR35]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+sm4" } | ||
// CHECK: attributes #[[ATTR36]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+lse,+neon,+rdm" } | ||
// CHECK: attributes #[[ATTR37:[0-9]+]] = { "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon,+rdm" } | ||
// CHECK: attributes #[[ATTR12]] = { noinline nounwind optnone "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+fp-armv8,+neon" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These attributes are autogenerated from update_cc_tests_check.py and are of no interest. Unfortunately I can't make the script not generate them so please ignore this part of the diff.
def : FMVExtension<"sve2-bitperm", "FEAT_SVE_BITPERM", "+sve2,+sve,+sve2-bitperm,+fullfp16,+fp-armv8,+neon", 400>; | ||
def : FMVExtension<"sve2-pmull128", "FEAT_SVE_PMULL128", "+sve2,+sve,+sve2-aes,+fullfp16,+fp-armv8,+neon", 390>; | ||
def : FMVExtension<"sve2-pmull128", "FEAT_SVE_PMULL128", "+pmull,+aes,+sve2,+sve,+sve2-pmull128,+sve2-aes,+fullfp16,+fp-armv8,+neon", 390>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hand writing these dependencies is so error prone. Once we review the remaining FMV features I am planning to autogenerate those from tablegen. For example the two lines below are already wrong. sve2-sha3
should enable +sha3 and sve2-sm4
should enable +sm4.
This premise is incorrect. For FEAT_SVE_AES and FEAT_SVE_PMULL128, the latest version of the Arm ARM (DDI 0487K.a) includes the following in the definition of
So it isn't permitted to implement just one of FEAT_SVE_AES and FEAT_SVE_PMULL128. Similarly, for FEAT_AES and FEAT_PMULL, the previous version of the Arm ARM (DDI 0487J.a) includes the following in the definition of
This last line was deleted in the latest Arm ARM (DDI 0487K.a), but it appears to be a mistake that was not intended to relax the architecture constraints. I've reported this discrepancy internally. |
I originally tried splitting these features (see relevant pull reguest llvm/llvm-project#110816), but the following came to my attention: According to https://developer.arm.com/documentation/ddi0487/latest Arm Architecture Reference Manual for A-profile architecture: D23.2.83 ID_AA64ZFR0_EL1, SVE Feature ID Register 0 ID_AA64ZFR0_EL1.AES, bits [7:4] > FEAT_SVE_AES implements the functionality identified by the value 0b0001. > FEAT_SVE_PMULL128 implements the functionality identified by the value 0b0010. > The permitted values are 0b0000 and 0b0010. Andrew Carlotti suggests that the same applies for ID_AA64ISAR0_EL1.AES (llvm/llvm-project#110816 (comment)) D19.2.61 ID_AA64ISAR0_EL1, AArch64 Instruction Set Attribute Register 0 ID_AA64ISAR0_EL1.AES, bits [7:4] > FEAT_AES implements the functionality identified by the value 0b0001. > FEAT_PMULL implements the functionality identified by the value 0b0010. > From Armv8, the permitted values are 0b0000 and 0b0010. This was removed from the latest release of the Arm Architecture Reference Manual, but it appears to be a mistake that was not intended to relax the architecture constraints. The discrepancy has been reported.
I originally tried splitting these features (see relevant pull reguest llvm/llvm-project#110816), but the following came to my attention: According to https://developer.arm.com/documentation/ddi0487/latest Arm Architecture Reference Manual for A-profile architecture: D23.2.83 ID_AA64ZFR0_EL1, SVE Feature ID Register 0 ID_AA64ZFR0_EL1.AES, bits [7:4] > FEAT_SVE_AES implements the functionality identified by the value 0b0001. > FEAT_SVE_PMULL128 implements the functionality identified by the value 0b0010. > The permitted values are 0b0000 and 0b0010. Andrew Carlotti suggests that the same applies for ID_AA64ISAR0_EL1.AES (llvm/llvm-project#110816 (comment)) D19.2.61 ID_AA64ISAR0_EL1, AArch64 Instruction Set Attribute Register 0 ID_AA64ISAR0_EL1.AES, bits [7:4] > FEAT_AES implements the functionality identified by the value 0b0001. > FEAT_PMULL implements the functionality identified by the value 0b0010. > From Armv8, the permitted values are 0b0000 and 0b0010. This was removed from the latest release of the Arm Architecture Reference Manual, but it appears to be a mistake that was not intended to relax the architecture constraints. The discrepancy has been reported.
I originally tried splitting these features (see relevant pull request llvm/llvm-project#110816), but the following came to my attention: According to https://developer.arm.com/documentation/ddi0487/latest Arm Architecture Reference Manual for A-profile architecture: D23.2.83 ID_AA64ZFR0_EL1, SVE Feature ID Register 0 ID_AA64ZFR0_EL1.AES, bits [7:4] > FEAT_SVE_AES implements the functionality identified by the value 0b0001. > FEAT_SVE_PMULL128 implements the functionality identified by the value 0b0010. > The permitted values are 0b0000 and 0b0010. Andrew Carlotti suggests that the same applies for ID_AA64ISAR0_EL1.AES (llvm/llvm-project#110816 (comment)) D19.2.61 ID_AA64ISAR0_EL1, AArch64 Instruction Set Attribute Register 0 ID_AA64ISAR0_EL1.AES, bits [7:4] > FEAT_AES implements the functionality identified by the value 0b0001. > FEAT_PMULL implements the functionality identified by the value 0b0010. > From Armv8, the permitted values are 0b0000 and 0b0010. This was removed from the latest release of the Arm Architecture Reference Manual, but it appears to be a mistake that was not intended to relax the architecture constraints. The discrepancy has been reported.
Currently in LLVM FeatureAES models both FEAT_AES and FEAT_PMULL lumped together. Similarly FeatureSVE2AES means FEAT_SVE_AES plus FEAT_SVE_PMULL128. However the architecture does not mandate that both need to be implemented at the same time. Splitting them will allow Function Multiversioning to enable backend support for 'aes' and 'sve2-aes'. I have added an override for the user visible names of the new features to preserve the old semantics for backwards compatibility with command line, target attribute and assembler directives.