Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AMDGPU: Add description for new atomicrmw metadata #85052

Merged
merged 24 commits into from
Jul 10, 2024

Conversation

arsenm
Copy link
Contributor

@arsenm arsenm commented Mar 13, 2024

Add a spec for yet-to-be-implemented metadata to allow the backend to
fully handle atomicrmw lowering. This is the base of an alternative
to #69229, which inverts the direction to be correct by default, and
extends to cover the peer device case.

Could use a better name

arsenm added 2 commits March 13, 2024 14:19
I couldn't figure out how to nicely embed a table within a table column.
Copy the formatting that LangRef uses for metadata, and introduce a
metadata section with subsections for each item. Also fix using subsection
markers in place of section markers to avoid sphinx errors.
Add a spec for yet-to-be-implemented metadata to allow the backend to
fully handle atomicrmw lowering. This is the base of an alternative
to llvm#69229, which inverts the direction to be correct by default, and
extends to cover the peer device case.

Could use a better name
@llvmbot
Copy link
Member

llvmbot commented Mar 13, 2024

@llvm/pr-subscribers-backend-amdgpu

Author: Matt Arsenault (arsenm)

Changes

Add a spec for yet-to-be-implemented metadata to allow the backend to
fully handle atomicrmw lowering. This is the base of an alternative
to #69229, which inverts the direction to be correct by default, and
extends to cover the peer device case.

Could use a better name


Full diff: https://github.com/llvm/llvm-project/pull/85052.diff

2 Files Affected:

  • (modified) llvm/docs/AMDGPUUsage.rst (+61-12)
  • (modified) llvm/docs/ReleaseNotes.rst (+2)
diff --git a/llvm/docs/AMDGPUUsage.rst b/llvm/docs/AMDGPUUsage.rst
index fd9ad7fac19a95..a6556bbd1752b9 100644
--- a/llvm/docs/AMDGPUUsage.rst
+++ b/llvm/docs/AMDGPUUsage.rst
@@ -1312,24 +1312,73 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
 
    List AMDGPU intrinsics.
 
+.. _amdgpu_metadata:
+
 LLVM IR Metadata
-------------------
+================
+
+The AMDGPU backend implements the following target custom LLVM IR
+metadata.
+
+.. _amdgpu_last_use:
+
+'``amdgpu.last.use``' Metadata
+------------------------------
+
+Sets TH_LOAD_LU temporal hint on load instructions that support it.
+Takes priority over nontemporal hint (TH_LOAD_NT). This takes no
+arguments.
+
+.. code-block:: llvm
+
+  %val = load i32, ptr %in, align 4, !amdgpu.last.use !{}
 
-The AMDGPU backend implements the following LLVM IR metadata.
+.. _amdgpu_no_access_location_types:
 
-.. list-table:: AMDGPU LLVM IR Metatdata
-  :name: amdgpu-llvm-ir-metadata-table
+'``amdgpu.no.access.location.types``' Metadata
+----------------------------------------------
 
-  * - Metadata Name
+Asserts a memory access does not access bytes residing in certain
+allocation kinds. This is intended for use with :ref:`atomicrmw
+<i_atomicrmw>` and other atomic instructions. This is required to emit
+a native hardware instruction for some :ref:`system scope
+<amdgpu-memory-scopes>` atomic operations on some subtargets. An
+:ref:`atomicrmw <i_atomicrmw>` without metadata will be treated
+conservatively as required to preserve the operation behavior in all
+cases.
+
+If the memory operation does access an address in an indicated region,
+any stored values and any returned results are :ref:`poison
+<poisonvalues>`. This has a single integer argument, interpreted as a
+bitfield. A 0 value is equivalent to removing the metadata.
+
+.. list-table::
+
+  * - Bit
     - Description
-    - Values
-  * - !amdgpu.last.use
-    - Sets TH_LOAD_LU temporal hint on load instructions that support it.
-      Takes priority over nontemporal hint (TH_LOAD_NT).
-    - {}
+  * - 0
+    - Not in fine-grained host memory.
+  * - 1
+    - Not in a remote connected peer device (address must be device local)
+
+.. code-block:: llvm
+
+  ; Indicates the access does not access fine-grained memory, or
+  ; remote device memory.
+  %old0 = atomicrmw sub ptr %ptr0, i32 1 acquire, !amdgpu.no.access.location.types !0
+
+  ; Indicates the access does not access fine-grained memory.
+  %old1 = atomicrmw sub ptr %ptr1, i32 1 acquire, !amdgpu.no.access.location.types !1
+
+  ; Indicates the access does not access peer device memory.
+  %old2 = atomicrmw sub ptr %ptr2, i32 1 acquire, !amdgpu.no.access.location.types !2
+
+  !0 = !{i32 3}
+  !1 = !{i32 1}
+  !2 = !{i32 2}
 
 LLVM IR Attributes
-------------------
+==================
 
 The AMDGPU backend supports the following LLVM IR attributes.
 
@@ -1451,7 +1500,7 @@ The AMDGPU backend supports the following LLVM IR attributes.
      ======================================= ==========================================================
 
 Calling Conventions
--------------------
+===================
 
 The AMDGPU backend supports the following calling conventions:
 
diff --git a/llvm/docs/ReleaseNotes.rst b/llvm/docs/ReleaseNotes.rst
index b34a5f31c5eb0a..95ebbb74fbbd7f 100644
--- a/llvm/docs/ReleaseNotes.rst
+++ b/llvm/docs/ReleaseNotes.rst
@@ -71,6 +71,8 @@ Changes to the AMDGPU Backend
 -----------------------------
 
 * Implemented the ``llvm.get.fpenv`` and ``llvm.set.fpenv`` intrinsics.
+* Added ``!amdgpu.no.access.location.types`` metadata to control
+  atomic behavior.
 
 Changes to the ARM Backend
 --------------------------

llvm/docs/AMDGPUUsage.rst Outdated Show resolved Hide resolved
@yxsamliu yxsamliu requested a review from b-sumner March 13, 2024 14:41
llvm/docs/AMDGPUUsage.rst Outdated Show resolved Hide resolved
llvm/docs/AMDGPUUsage.rst Outdated Show resolved Hide resolved
@arsenm arsenm changed the title AMDGPU: Add description for amdgpu.no.access.location.types metadata AMDGPU: Add description for new atomicrmw metadata Apr 15, 2024
@arsenm arsenm requested a review from yxsamliu April 19, 2024 13:53
@yxsamliu
Copy link
Collaborator

LGTM

@arsenm
Copy link
Contributor Author

arsenm commented Apr 19, 2024

Wondering if both pieces should end in "memory" or "memory.access" for consistency

@yxsamliu
Copy link
Collaborator

Wondering if both pieces should end in "memory" or "memory.access" for consistency

I think "access" is unnecessary.

@b-sumner
Copy link

Wondering if both pieces should end in "memory" or "memory.access" for consistency

I think "access" is unnecessary.

I tend to agree.

arsenm added a commit that referenced this pull request Apr 19, 2024
Add baseline tests which should comprehensively test the new atomic
metadata. Test codegen / expansion, and preservation in a few
transforms.

New metadata defined in #85052
aniplcc pushed a commit to aniplcc/llvm-project that referenced this pull request Apr 21, 2024
Add baseline tests which should comprehensively test the new atomic
metadata. Test codegen / expansion, and preservation in a few
transforms.

New metadata defined in llvm#85052
hdelan added a commit to hdelan/llvm that referenced this pull request Apr 22, 2024
AMDGPU reflect pass is needed to choose between safe and unsafe atomics
at the libclc level. In the long run we will delete this patch as work
is being done to ensure correct lowering of atomic instructions. See
patches:

llvm/llvm-project#85052
llvm/llvm-project#69229

This work is necessary as malloc shared atomics rely on PCIe atomics
which can have patchy and unreliable support. We want to therefore be
able to choose at compile time whether we should use safe atomics using
CAS (which PCIe should support), or if we want to rely of the
availability of the newest PCIe atomics, if malloc shared atomics are
desired.

Also changes the implementation of Or, And so that they can choose
between the safe or unsafe version based on the AMDGPU reflect value.
ldrumm pushed a commit to intel/llvm that referenced this pull request Apr 24, 2024
… AMDGPU atomics (#11467)

AMDGPU reflect pass is needed to choose between safe and unsafe atomics
at the libclc level. In the long run we will delete this patch as work
is being done to ensure correct lowering of atomic instructions. See
patches:

llvm/llvm-project#85052
llvm/llvm-project#69229

This work is necessary as malloc shared atomics rely on PCIe atomics
which can have patchy and unreliable support. Therefore, we want to be
able to choose at compile time whether we should use safe atomics using
CAS (which PCIe should support), or if we want to rely of the
availability of the newest PCIe atomics, if malloc shared atomics are
desired.

Also changes the implementation of `atomic_or`, `atomic_and` so that
they
can choose between the safe or unsafe version based on the AMDGPU
reflect value.
Copy link
Contributor Author

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ping

@arsenm
Copy link
Contributor Author

arsenm commented Jul 10, 2024

ping

@arsenm arsenm merged commit 62d9497 into llvm:main Jul 10, 2024
8 checks passed
@arsenm arsenm deleted the amdgpu-atomic-metadata branch July 10, 2024 13:39
aaryanshukla pushed a commit to aaryanshukla/llvm-project that referenced this pull request Jul 14, 2024
Add a spec for yet-to-be-implemented metadata to allow the backend to
fully handle atomicrmw lowering. This is the base of an alternative
to llvm#69229, which inverts the direction to be correct by default, and
extends to cover the peer device case.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants