From e74ca1be822858432d7d9002574bb349f02b5ede Mon Sep 17 00:00:00 2001 From: Sander de Smalen Date: Fri, 26 Jul 2024 15:54:24 +0100 Subject: [PATCH 1/5] [SME] Add __arm_agnostic("sme_za_state") keyword attribute The __arm_agnostic keyword attribute enables the user to specify that a function is agnostic to a specified piece of architectural state. That means that the function must preserve this state when it exists, and otherwise ignores its contents. The reason for not naming this something like '__arm_za_compatible' is that we might want to use the attribute keyword for other architectural state in the future. --- main/acle.md | 43 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 43 insertions(+) diff --git a/main/acle.md b/main/acle.md index 34b1283d..fcc04a27 100644 --- a/main/acle.md +++ b/main/acle.md @@ -400,6 +400,7 @@ Armv8.4-A [[ARMARMv84]](#ARMARMv84). Support is added for the Dot Product intrin * Added a requirement for function version declaration in Function Multi Versioning. * Fixed some rendering issues in the online Markdown documentation and fixed a misplaced anchor. +* Added [`__arm_agnostic`](#arm_agnostic) keyword attribute ### References @@ -832,6 +833,7 @@ predefine the associated macro to a nonzero value. | [`__arm_new`](#arm_new) | function declaration | Argument-dependent | | [`__arm_out`](#ways-of-sharing-state) | function type | Argument-dependent | | [`__arm_preserves`](#ways-of-sharing-state) | function type | Argument-dependent | +| [`__arm_agnostic`](#arm_agnostic) | function type | `__ARM_FEATURE_SME` | | [`__arm_streaming`](#arm_streaming) | function type | `__ARM_FEATURE_SME` | | [`__arm_streaming_compatible`](#arm_streaming_compatible) | function type | `__ARM_FEATURE_SME` | @@ -4836,6 +4838,47 @@ if such a restoration is necessary. 
For example: } ``` +## `__arm_agnostic` + +A function with the `__arm_agnostic` [keyword attribute](#keyword-attributes) must +preserve the architectural state that is specified by its arguments when such +state exists at runtime. The function is otherwise unconcerned with the contents +of this state. + +The `__arm_agnostic` [keyword attribute](#keyword-attributes) applies to +**function types** and accepts the following arguments: + +```"sme_za_state"``` + +* If the function is defined and PSTATE.ZA is available, the definition must + preserve all architectural state enabled by PSTATE.ZA. + + It is the compiler's responsibility to ensure that this state is preserved. + + The compiled function must be forward-compatible and should not make any + assumptions on what state is enabled by PSTATE.ZA, which could be done by + relying on ABI routines available on the platform for the allocation of + a buffer and the saving and restoring of state. + +* If PSTATE.ZA is available, then the [abstract machine](#abstract-machine) + ensures that on return from the function, the value of PSTATE.ZA is the same + as it was on entry to the function. + +* If the function forms part of the object code's ABI, that object code + function has a “ZA-compatible interface”; see [[AAPCS64]](#AAPCS64) + for more details. + +* It is not valid for a function declaration with `__arm_agnostic("sme_za_state")` + to be combined with any of the following [keyword attributes](#keyword-attributes): + * `__arm_new()` + * `__arm_in()` + * `__arm_out()` + * `__arm_inout()` + * `__arm_preserves()` + + when `` describes state that is enabled by PSTATE.ZA. 
+ + ## Mapping to the Procedure Call Standard [[AAPCS64]](#AAPCS64) classifies functions as having one of the following From c7c06937d2ef4c62763edd9fb92cb2e74820d12d Mon Sep 17 00:00:00 2001 From: Sander de Smalen Date: Tue, 3 Sep 2024 11:42:00 +0100 Subject: [PATCH 2/5] Address comments --- main/acle.md | 61 +++++++++++++++++++++++++--------------------------- 1 file changed, 29 insertions(+), 32 deletions(-) diff --git a/main/acle.md b/main/acle.md index fcc04a27..08a5c1c8 100644 --- a/main/acle.md +++ b/main/acle.md @@ -827,13 +827,13 @@ predefine the associated macro to a nonzero value. | **Name** | **Target** | **Predefined macro** | | ----------------------------------------------------------- | --------------------- | --------------------------------- | +| [`__arm_agnostic`](#arm_agnostic) | function type | `__ARM_FEATURE_SME` | | [`__arm_locally_streaming`](#arm_locally_streaming) | function declaration | `__ARM_FEATURE_LOCALLY_STREAMING` | | [`__arm_in`](#ways-of-sharing-state) | function type | Argument-dependent | | [`__arm_inout`](#ways-of-sharing-state) | function type | Argument-dependent | | [`__arm_new`](#arm_new) | function declaration | Argument-dependent | | [`__arm_out`](#ways-of-sharing-state) | function type | Argument-dependent | | [`__arm_preserves`](#ways-of-sharing-state) | function type | Argument-dependent | -| [`__arm_agnostic`](#arm_agnostic) | function type | `__ARM_FEATURE_SME` | | [`__arm_streaming`](#arm_streaming) | function type | `__ARM_FEATURE_SME` | | [`__arm_streaming_compatible`](#arm_streaming_compatible) | function type | `__ARM_FEATURE_SME` | @@ -4840,44 +4840,27 @@ if such a restoration is necessary. For example: ## `__arm_agnostic` -A function with the `__arm_agnostic` [keyword attribute](#keyword-attributes) must -preserve the architectural state that is specified by its arguments when such -state exists at runtime. The function is otherwise unconcerned with the contents -of this state. 
+A function with the `__arm_agnostic` [keyword attribute](#keyword-attributes) +must preserve the architectural state that is specified by its arguments when +such state exists at runtime. The function is otherwise unconcerned with this +state. The `__arm_agnostic` [keyword attribute](#keyword-attributes) applies to **function types** and accepts the following arguments: ```"sme_za_state"``` -* If the function is defined and PSTATE.ZA is available, the definition must - preserve all architectural state enabled by PSTATE.ZA. - - It is the compiler's responsibility to ensure that this state is preserved. - - The compiled function must be forward-compatible and should not make any - assumptions on what state is enabled by PSTATE.ZA, which could be done by - relying on ABI routines available on the platform for the allocation of - a buffer and the saving and restoring of state. - -* If PSTATE.ZA is available, then the [abstract machine](#abstract-machine) - ensures that on return from the function, the value of PSTATE.ZA is the same - as it was on entry to the function. - -* If the function forms part of the object code's ABI, that object code - function has a “ZA-compatible interface”; see [[AAPCS64]](#AAPCS64) - for more details. - -* It is not valid for a function declaration with `__arm_agnostic("sme_za_state")` - to be combined with any of the following [keyword attributes](#keyword-attributes): - * `__arm_new()` - * `__arm_in()` - * `__arm_out()` - * `__arm_inout()` - * `__arm_preserves()` +* This attribute affects the ABI of a function, which must implement an + [agnostic-ZA interface](#agnostic-za). It is the compiler's responsibility + to ensure that the function's object code honors the ABI requirements. - when `` describes state that is enabled by PSTATE.ZA. +* The use of `__arm_agnostic("sme_za_state")` allows writing functions that + are compatible with ZA state without having to share ZA state with the + caller, as required by `__arm_preserves`. 
+* It is not valid for a function declaration with + `__arm_agnostic("sme_za_state")` to [share](#shares-state) PSTATE.ZA state + with its caller. ## Mapping to the Procedure Call Standard @@ -4890,6 +4873,10 @@ interfaces: * a “shared-ZA” interface + + +* a "agnostic-ZA" interface + If a C or C++ function F forms part of the object code's ABI, that object code function has a shared-ZA interface if and only if at least one of the following is true: @@ -4898,7 +4885,17 @@ one of the following is true: * F shares ZT0 with its caller -All other functions have a private-ZA interface. +All other functions have either a private-ZA or an agnostic-ZA interface. + +If F implements an agnostic-ZA interface and PSTATE.ZA is available at runtime, +then a call to F must return with its ZA state unchanged in accordance +to the [[AAPCS64]](#AAPCS64). In practice this means that calls to F don't have +to emit code to set up a lazy-save for ZA or to preserve other state like ZT0 +when such state is live at the call site. + +The implementation of F must not make any assumptions on the availability of +PSTATE.ZA or any architectural state associated with it. + ## Function definitions From ecd9735c6803abfc53ffefbb9ef4a71895c32b86 Mon Sep 17 00:00:00 2001 From: Sander de Smalen Date: Mon, 16 Dec 2024 09:29:08 +0000 Subject: [PATCH 3/5] Address comments --- main/acle.md | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/main/acle.md b/main/acle.md index 08a5c1c8..96890036 100644 --- a/main/acle.md +++ b/main/acle.md @@ -4856,7 +4856,8 @@ The `__arm_agnostic` [keyword attribute](#keyword-attributes) applies to * The use of `__arm_agnostic("sme_za_state")` allows writing functions that are compatible with ZA state without having to share ZA state with the - caller, as required by `__arm_preserves`. + caller, as required by `__arm_preserves`. The use of this attribute + does not imply that SME is available. 
* It is not valid for a function declaration with `__arm_agnostic("sme_za_state")` to [share](#shares-state) PSTATE.ZA state @@ -4875,23 +4876,27 @@ interfaces: -* a "agnostic-ZA" interface +* an "agnostic-ZA" interface -If a C or C++ function F forms part of the object code's ABI, that -object code function has a shared-ZA interface if and only if at least -one of the following is true: +If a C or C++ function F forms part of the object code's ABI: -* F shares ZA with its caller +* the object code function has a shared-ZA interface if and only if at least + one of the following is true: -* F shares ZT0 with its caller + * F shares ZA with its caller -All other functions have either a private-ZA or an agnostic-ZA interface. + * F shares ZT0 with its caller -If F implements an agnostic-ZA interface and PSTATE.ZA is available at runtime, -then a call to F must return with its ZA state unchanged in accordance -to the [[AAPCS64]](#AAPCS64). In practice this means that calls to F don't have -to emit code to set up a lazy-save for ZA or to preserve other state like ZT0 -when such state is live at the call site. +* the object code function has an agnostic-ZA interface if and only if F's type + has an `__arm_agnostic("sme_za_state")` attribute. + +All other functions have a private-ZA interface. + +If F implements an agnostic-ZA interface and ZA is active, then a call to F must +return with its ZA state unchanged in accordance with the [[AAPCS64]](#AAPCS64). +In practice this means that calls to F don't have to emit code to set up a +lazy-save for ZA or to preserve other state like ZT0 when such state is live at +the call site. The implementation of F must not make any assumptions on the availability of PSTATE.ZA or any architectural state associated with it. @@ -4982,6 +4987,8 @@ following is true: * F [uses](#uses-state) `"zt0"` +* F implements an [agnostic ZA](#agnostic-za) interface. 
+ Otherwise, ZA can be off or dormant on entry to A, as for what AAPCS64 calls “private-ZA” functions. From 234cf33bad23b6933cd3ba8c72df12da93b3fcdf Mon Sep 17 00:00:00 2001 From: Sander de Smalen Date: Fri, 20 Dec 2024 10:30:51 +0000 Subject: [PATCH 4/5] Describe clobbers with __arm_agnostic("sme_za_state") --- main/acle.md | 40 +++++++++++++++++++++++----------------- 1 file changed, 23 insertions(+), 17 deletions(-) diff --git a/main/acle.md b/main/acle.md index 96890036..37f6aade 100644 --- a/main/acle.md +++ b/main/acle.md @@ -4892,15 +4892,6 @@ If a C or C++ function F forms part of the object code's ABI: All other functions have a private-ZA interface. -If F implements an agnostic-ZA interface and ZA is active, then a call to F must -return with its ZA state unchanged in accordance with the [[AAPCS64]](#AAPCS64). -In practice this means that calls to F don't have to emit code to set up a -lazy-save for ZA or to preserve other state like ZT0 when such state is live at -the call site. - -The implementation of F must not make any assumptions on the availability of -PSTATE.ZA or any architectural state associated with it. - ## Function definitions @@ -4983,14 +4974,22 @@ function F if at least one of the following is true: Otherwise, ZA can be in any state on entry to A if at least one of the following is true: -* F [uses](#uses-state) `"za"` +* F [uses](#uses-state) `"za"`. + +* F [uses](#uses-state) `"zt0"`. + +* F is what the [[AAPCS64]](#AAPCS64) calls an ["agnostic-ZA"](#agnostic-za) + function and A's clobber-list does not include `"za"` or `"zt0"`. + +Otherwise, ZA can be off or dormant on entry to A if at least one of the +following is true: -* F [uses](#uses-state) `"zt0"` +* F is what the [[AAPCS64]](#AAPCS64) calls a "private-ZA" function. -* F implements an [agnostic ZA](#agnostic-za) interface. +* F is what the [[AAPCS64]](#AAPCS64) calls an ["agnostic-ZA"](#agnostic-za) + function and A's clobber list includes `"za"` or `"zt0"`. 
-Otherwise, ZA can be off or dormant on entry to A, as for what AAPCS64 -calls “private-ZA” functions. +Otherwise, ZA is off on entry to A. If ZA is active on entry to A then A's instructions must ensure that ZA is also active when the asm finishes. @@ -5015,9 +5014,16 @@ depend on ZT0 as well as ZA. | **ZA state before A** | **ZA state after A** | **Possible if…** | | --------------------- | -------------------- | -------------------------------------- | | off | off | F's uses and A's clobbers are disjoint | -| dormant | dormant | " " " | -| dormant | off | " " ", and A clobbers `"za"` | -| active | active | F uses `"za"` and/or `"zt0"` | +| | | or F is an | +| | | ["#agnostic-ZA"](#agnostic-za) | +| | | function. | +| dormant | dormant | "" "" "" | +| dormant | off | A clobbers `"za"`, but F does not | +| | | [use](#uses-state) `"za"`or `"zt0"`. | +| active | active | F uses `"za"` and/or `"zt0"`, or | +| | | F is an ["#agnostic-ZA"](#agnostic-za) | +| | | function and A's clobbers do not | +| | | contain `"za"` or `"zt0"`. | The [`__ARM_STATE` macros](#state-strings) indicate whether a compiler is guaranteed to support a particular clobber string. For example, From ee36004c8b4226ac6aa632553a8288cb9e58ef1c Mon Sep 17 00:00:00 2001 From: Sander de Smalen Date: Fri, 20 Dec 2024 15:05:15 +0000 Subject: [PATCH 5/5] Address comments --- main/acle.md | 30 ++++++++++-------------------- 1 file changed, 10 insertions(+), 20 deletions(-) diff --git a/main/acle.md b/main/acle.md index 37f6aade..a69cbf2a 100644 --- a/main/acle.md +++ b/main/acle.md @@ -4978,18 +4978,11 @@ following is true: * F [uses](#uses-state) `"zt0"`. -* F is what the [[AAPCS64]](#AAPCS64) calls an ["agnostic-ZA"](#agnostic-za) - function and A's clobber-list does not include `"za"` or `"zt0"`. +* F's type has an [`__arm_agnostic("sme_za_state")` attribute](#agnostic-za) + and A's clobber-list includes neither `"za"` nor `"zt0"`. 
-Otherwise, ZA can be off or dormant on entry to A if at least one of the -following is true: - -* F is what the [[AAPCS64]](#AAPCS64) calls a "private-ZA" function. - -* F is what the [[AAPCS64]](#AAPCS64) calls an ["agnostic-ZA"](#agnostic-za) - function and A's clobber list includes `"za"` or `"zt0"`. - -Otherwise, ZA is off on entry to A. +Otherwise, ZA can be off or dormant on entry to A, in the same way as if F were +to call what the [[AAPCS64]](#AAPCS64) describes as a "private-ZA" function. If ZA is active on entry to A then A's instructions must ensure that ZA is also active when the asm finishes. @@ -5014,16 +5007,13 @@ depend on ZT0 as well as ZA. | **ZA state before A** | **ZA state after A** | **Possible if…** | | --------------------- | -------------------- | -------------------------------------- | | off | off | F's uses and A's clobbers are disjoint | -| | | or F is an | -| | | ["#agnostic-ZA"](#agnostic-za) | -| | | function. | -| dormant | dormant | "" "" "" | -| dormant | off | A clobbers `"za"`, but F does not | -| | | [use](#uses-state) `"za"`or `"zt0"`. | +| dormant | dormant | " " " | +| dormant | off | " " ", and A clobbers `"za"` | | active | active | F uses `"za"` and/or `"zt0"`, or | -| | | F is an ["#agnostic-ZA"](#agnostic-za) | -| | | function and A's clobbers do not | -| | | contain `"za"` or `"zt0"`. | +| | | F's type has an | +| | | `__arm_agnostic("sme_za_state")` | +| | | attribute with A's clobber-list | +| | | including neither `"za"` nor `"zt0"` | The [`__ARM_STATE` macros](#state-strings) indicate whether a compiler is guaranteed to support a particular clobber string. For example,