Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TBAA] Only emit pointer tbaa metedata for record types. #116991

Merged
merged 5 commits into from
Nov 21, 2024

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Nov 20, 2024

Be conservative if the type isn't a record type. Handling other types may
require stripping const-qualifiers inside the type, e.g. MemberPointerType.

Also look through array types same as through pointer types, to not pessimize
arrays of pointers.

Without this, we assign different tags to the accesses for p an q in the
second test in cwg158.

@fhahn fhahn requested a review from Endilll as a code owner November 20, 2024 15:42
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen labels Nov 20, 2024
@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-clang-codegen

Author: Florian Hahn (fhahn)

Changes

Be conservative if the type isn't a record type. Handling other types may
require stripping const-qualifiers inside the type, e.g. MemberPointerType.

Without this, we assign different tags to the accesses for p an q in the
second test in cwg158.


Full diff: https://github.com/llvm/llvm-project/pull/116991.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenTBAA.cpp (+6)
  • (modified) clang/test/CXX/drs/cwg158.cpp (+7-5)
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index c31579e8323174..73aae6b0db8d90 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -230,6 +230,12 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
               ->getString();
       TyName = Name;
     } else {
+      // Be conservative if the type isn't a record type. Handling other types
+      // may require stripping const-qualifiers inside the type, e.g.
+      // MemberPointerType.
+      if (!Ty->isRecordType())
+        return AnyPtr;
+
       // For non-builtin types use the mangled name of the canonical type.
       llvm::raw_svector_ostream TyOut(TyName);
       MangleCtx->mangleCanonicalTypeName(QualType(Ty, 0), TyOut);
diff --git a/clang/test/CXX/drs/cwg158.cpp b/clang/test/CXX/drs/cwg158.cpp
index 9301c790297e9d..d20f715d637917 100644
--- a/clang/test/CXX/drs/cwg158.cpp
+++ b/clang/test/CXX/drs/cwg158.cpp
@@ -1,12 +1,14 @@
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++98 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++11 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++14 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++1z %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++98 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++11 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++14 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++1z %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++1z %s -O3 -pointer-tbaa -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
 
 // cwg158: yes
 
 // CHECK-LABEL: define {{.*}} @_Z1f
 const int *f(const int * const *p, int **q) {
+  // CHECK: load ptr, ptr %p.addr
   // CHECK: load ptr, {{.*}}, !tbaa ![[INTPTR_TBAA:[^,]*]]
   const int *x = *p;
   // CHECK: store ptr null, {{.*}}, !tbaa ![[INTPTR_TBAA]]
@@ -18,10 +20,10 @@ struct A {};
 
 // CHECK-LABEL: define {{.*}} @_Z1g
 const int *(A::*const *g(const int *(A::* const **p)[3], int *(A::***q)[3]))[3] {
+  // CHECK: load ptr, ptr %p.addr
   // CHECK: load ptr, {{.*}}, !tbaa ![[MEMPTR_TBAA:[^,]*]]
   const int *(A::*const *x)[3] = *p;
   // CHECK: store ptr null, {{.*}}, !tbaa ![[MEMPTR_TBAA]]
   *q = 0;
   return x;
 }
-

@llvmbot
Copy link
Member

llvmbot commented Nov 20, 2024

@llvm/pr-subscribers-clang

Author: Florian Hahn (fhahn)

Changes

Be conservative if the type isn't a record type. Handling other types may
require stripping const-qualifiers inside the type, e.g. MemberPointerType.

Without this, we assign different tags to the accesses for p an q in the
second test in cwg158.


Full diff: https://github.com/llvm/llvm-project/pull/116991.diff

2 Files Affected:

  • (modified) clang/lib/CodeGen/CodeGenTBAA.cpp (+6)
  • (modified) clang/test/CXX/drs/cwg158.cpp (+7-5)
diff --git a/clang/lib/CodeGen/CodeGenTBAA.cpp b/clang/lib/CodeGen/CodeGenTBAA.cpp
index c31579e8323174..73aae6b0db8d90 100644
--- a/clang/lib/CodeGen/CodeGenTBAA.cpp
+++ b/clang/lib/CodeGen/CodeGenTBAA.cpp
@@ -230,6 +230,12 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
               ->getString();
       TyName = Name;
     } else {
+      // Be conservative if the type isn't a record type. Handling other types
+      // may require stripping const-qualifiers inside the type, e.g.
+      // MemberPointerType.
+      if (!Ty->isRecordType())
+        return AnyPtr;
+
       // For non-builtin types use the mangled name of the canonical type.
       llvm::raw_svector_ostream TyOut(TyName);
       MangleCtx->mangleCanonicalTypeName(QualType(Ty, 0), TyOut);
diff --git a/clang/test/CXX/drs/cwg158.cpp b/clang/test/CXX/drs/cwg158.cpp
index 9301c790297e9d..d20f715d637917 100644
--- a/clang/test/CXX/drs/cwg158.cpp
+++ b/clang/test/CXX/drs/cwg158.cpp
@@ -1,12 +1,14 @@
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++98 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++11 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++14 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
-// RUN: %clang_cc1 -triple x86_64-linux -std=c++1z %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++98 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++11 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++14 %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++1z %s -O3 -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
+// RUN: %clang_cc1 -triple x86_64-linux -std=c++1z %s -O3 -pointer-tbaa -disable-llvm-passes -pedantic-errors -emit-llvm -o - | FileCheck --check-prefixes=CHECK %s
 
 // cwg158: yes
 
 // CHECK-LABEL: define {{.*}} @_Z1f
 const int *f(const int * const *p, int **q) {
+  // CHECK: load ptr, ptr %p.addr
   // CHECK: load ptr, {{.*}}, !tbaa ![[INTPTR_TBAA:[^,]*]]
   const int *x = *p;
   // CHECK: store ptr null, {{.*}}, !tbaa ![[INTPTR_TBAA]]
@@ -18,10 +20,10 @@ struct A {};
 
 // CHECK-LABEL: define {{.*}} @_Z1g
 const int *(A::*const *g(const int *(A::* const **p)[3], int *(A::***q)[3]))[3] {
+  // CHECK: load ptr, ptr %p.addr
   // CHECK: load ptr, {{.*}}, !tbaa ![[MEMPTR_TBAA:[^,]*]]
   const int *(A::*const *x)[3] = *p;
   // CHECK: store ptr null, {{.*}}, !tbaa ![[MEMPTR_TBAA]]
   *q = 0;
   return x;
 }
-

@rjmccall
Copy link
Contributor

Hmm. We're talking here about C++'s type similarity rules, which come from C++ [conv.qual]p1:

A cv-decomposition of a type T is a sequence of cv_i and P_i such that T is “cv_0 P_0 cv_1 P_1 ··· cv_n−1 P_n−1 cv_n U” for n ≥ 0, where each cv_i is a set of cv-qualifiers, and each P_i is “pointer to”, “pointer to member of class C_i of type”, “array of N_i”, or “array of unknown bound of”.

Okay, so we have pointers, member pointers, and arrays, and the upshot is that we need to be ignoring qualifiers on the pointee/element type recursively in all of these. In the current code, we're falling short of that in two ways:

  • we're completely not walking into member pointers
  • we're not recursively walking into arrays; we do look through them at the last step, but if it's an array of pointers, we won't keep applying the similarity rule to the pointee type

Your patch happens to conservatively fix both of these, but I think we should probably go ahead and fix the array issue now, and then maybe it's just member pointers that we need to bail out conservatively for. The right fix for the array issue is just that we need the loop to look through array structure at every step before it checks for another pointer type.

@fhahn fhahn force-pushed the pointer-tbaa-record-type-only branch from 378ac8e to f6adf00 Compare November 20, 2024 22:42
@@ -206,12 +206,14 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
if (!CodeGenOpts.PointerTBAA)
return AnyPtr;
// Compute the depth of the pointer and generate a tag of the form "p<depth>
// <base type tag>".
// <base type tag>". Look through pointer and array types to determine the
// base type.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion as a replacement for the comment:

    // C++ [basic.lval]p11 permits objects to accessed through an l-value
    // of similar type. Two types are similar under C++ [conv.qual]p2
    // if the decomposition of the types into pointers, member pointers,
    // and arrays has the same structure when ignoring cv-qualifiers at
    // each level of the decomposition. Meanwhile, C makes T(*)[] and
    // T(*)[N] compatible, which would really complicate any attempt to
    // distinguish pointers to arrays by their bounds. It's simpler, and
    // much easier to explain to users, to simply treat all pointers to
    // arrays as pointers to their element type for aliasing purposes.
    // So when creating a TBAA tag for a pointer type, we recursively
    // ignore both qualifiers and array types when decomposing
    // the pointee type. The only meaningful remaining structure is the
    // number of pointer types we encountered along the way, so we
    // just produce the tag "p<depth> <base type tag>". If we do find a
    // member pointer type, for now we just conservatively bail out
    // with AnyPtr (below) rather than trying to create a tag that
    // honors the similar-type rules while still distinguishing different
    // kinds of member pointer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced, thanks!

Ty = Ty->getPointeeType().getTypePtr();
} while (Ty->isPointerType());
Ty = Ty->isPointerType() ? Ty->getPointeeType().getTypePtr()
: Ty->getArrayElementTypeNoTypeQual();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just do Ty = Ty->getPointeeType()->getBaseElementTypeUnsafe();. The "unsafety" is just incorrectness about type qualifiers, which we're fine with here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

} while (Ty->isPointerType());
Ty = Ty->isPointerType() ? Ty->getPointeeType().getTypePtr()
: Ty->getArrayElementTypeNoTypeQual();
} while (Ty->isPointerType() || Ty->isArrayType());
Ty = Context.getBaseElementType(QualType(Ty, 0)).getTypePtr();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop this now.

// Be conservative if the type a MemberPointerType. Those would require
// stripping const-qualifiers inside the type.
if (Ty->isMemberPointerType())
return AnyPtr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know, you were probably right the first time to bail out for all non-record types — if we run into a vector type, or an ObjC pointer type, or something like that, it'd be better to fall back on AnyPtr. Please leave a comment saying that we're specifically required to do this for member pointers until we implement the similar-types rule, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restored and added, thanks!

@fhahn fhahn force-pushed the pointer-tbaa-record-type-only branch from f6adf00 to 4e30502 Compare November 21, 2024 14:12
} while (Ty->isPointerType());
Ty = Context.getBaseElementType(QualType(Ty, 0)).getTypePtr();
Ty = Ty->getPointeeType()->getBaseElementTypeUnsafe();
} while (Ty->isPointerType() || Ty->isArrayType());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to check for array types anymore, getBaseElementTypeUnsafe() can never return an array type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, thanks.

@@ -230,6 +243,12 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) {
->getString();
TyName = Name;
} else {
// Be conservative if the type isn't a RecordType. We are specifically
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Be conservative if the type isn't a RecordType. We are specifically
// Be conservative if the type isn't a RecordType. We are specifically

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped, thanks

}

// POINTER-TBAA: [[PTRARRAY_TBAA]] = !{[[PTRARRAY_TY:!.+]], [[PTRARRAY_TY]], i64 0}
// POINTER-TBAA: [[PTRARRAY_TY]] = !{!"p3 int", !4, i64 0}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Why is this p3? I'd expect it to be p2 int: the l-value is of type int *(*)[], and we're ignoring all the array structure, so this should have the same tag as int **, i.e. p2 int, right?

Also, I assume the !4 is brittle. That's AnyPtr?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RUN line with -fpointer-tbaa was missing the POINTER-TBAA prefix, adjusted and fixed.

Be conservative if the type isn't a record type. Handling other types may
require stripping const-qualifiers inside the type, e.g. MemberPointerType.

Without this, we assign different tags to the accesses for p an q in the
second test in cwg158.
@fhahn fhahn force-pushed the pointer-tbaa-record-type-only branch from 4e30502 to 2372e81 Compare November 21, 2024 18:47
@fhahn fhahn merged commit decb874 into llvm:main Nov 21, 2024
6 of 7 checks passed
@fhahn fhahn deleted the pointer-tbaa-record-type-only branch November 21, 2024 20:01
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 21, 2024
Support for more precise TBAA metadata has been added a while ago
(behind the -fpointer-tbaa flag). The more precise TBAA metadata allows
treating accesses of different pointer types as no-alias.

This helps to remove more redundant loads and stores in a number of
workloads.

Some highlights on the impact across llvm-test-suite's MultiSource,
SPEC2006 & SPEC2017 include:
 * +2% more NoAlias results for memory access
 * +4% more loops vectorized
 * +3% more stores removed by DSE.

This closes a relatively big gap to GCC, which has been supporting
disambiguating based on pointer types for a long time.
(https://clang.godbolt.org/z/K7Wbhrz4q)

Pointer-TBAA support for pointers to builtin types has been added in
llvm#76612.

Support for user-defined types has been added in
llvm#110569.

There are 2 pending PRs with bug fixes for special cases uncovered
during some of my testing:
 * llvm#116991
 * llvm#116596
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 21, 2024
Support for more precise TBAA metadata has been added a while ago
(behind the -fpointer-tbaa flag). The more precise TBAA metadata allows
treating accesses of different pointer types as no-alias.

This helps to remove more redundant loads and stores in a number of
workloads.

Some highlights on the impact across llvm-test-suite's MultiSource,
SPEC2006 & SPEC2017 include:
 * +2% more NoAlias results for memory access
 * +4% more loops vectorized
 * +3% more stores removed by DSE.

This closes a relatively big gap to GCC, which has been supporting
disambiguating based on pointer types for a long time.
(https://clang.godbolt.org/z/K7Wbhrz4q)

Pointer-TBAA support for pointers to builtin types has been added in
llvm#76612.

Support for user-defined types has been added in
llvm#110569.

There are 2 pending PRs with bug fixes for special cases uncovered
during some of my testing:
 * llvm#116991
 * llvm#116596
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 22, 2024
Support for more precise TBAA metadata has been added a while ago
(behind the -fpointer-tbaa flag). The more precise TBAA metadata allows
treating accesses of different pointer types as no-alias.

This helps to remove more redundant loads and stores in a number of
workloads.

Some highlights on the impact across llvm-test-suite's MultiSource,
SPEC2006 & SPEC2017 include:
 * +2% more NoAlias results for memory access
 * +4% more loops vectorized
 * +3% more stores removed by DSE.

This closes a relatively big gap to GCC, which has been supporting
disambiguating based on pointer types for a long time.
(https://clang.godbolt.org/z/K7Wbhrz4q)

Pointer-TBAA support for pointers to builtin types has been added in
llvm#76612.

Support for user-defined types has been added in
llvm#110569.

There are 2 pending PRs with bug fixes for special cases uncovered
during some of my testing:
 * llvm#116991
 * llvm#116596
fhahn added a commit to fhahn/llvm-project that referenced this pull request Dec 3, 2024
Support for more precise TBAA metadata has been added a while ago
(behind the -fpointer-tbaa flag). The more precise TBAA metadata allows
treating accesses of different pointer types as no-alias.

This helps to remove more redundant loads and stores in a number of
workloads.

Some highlights on the impact across llvm-test-suite's MultiSource,
SPEC2006 & SPEC2017 include:
 * +2% more NoAlias results for memory access
 * +4% more loops vectorized
 * +3% more stores removed by DSE.

This closes a relatively big gap to GCC, which has been supporting
disambiguating based on pointer types for a long time.
(https://clang.godbolt.org/z/K7Wbhrz4q)

Pointer-TBAA support for pointers to builtin types has been added in
llvm#76612.

Support for user-defined types has been added in
llvm#110569.

There are 2 pending PRs with bug fixes for special cases uncovered
during some of my testing:
 * llvm#116991
 * llvm#116596
fhahn added a commit that referenced this pull request Dec 4, 2024
Support for more precise TBAA metadata has been added a while ago
(behind the -fpointer-tbaa flag). The more precise TBAA metadata allows
treating accesses of different pointer types as no-alias.

This helps to remove more redundant loads and stores in a number of
workloads.

Some highlights on the impact across llvm-test-suite's MultiSource,
SPEC2006 & SPEC2017 include:
 * +2% more NoAlias results for memory accesses
 * +3% more stores removed by DSE,
 * +4% more loops vectorized.

This closes a relatively big gap to GCC, which has been supporting
disambiguating based on pointer types for a long time.
(https://clang.godbolt.org/z/K7Wbhrz4q)

Pointer-TBAA support for pointers to builtin types has been added in
#76612.

Support for user-defined types has been added in
#110569.

There are 2 recent PRs with bug fixes for special cases uncovered during
testing:
 * #116991
 * #116596

PR: #117244
TIFitis pushed a commit to TIFitis/llvm-project that referenced this pull request Dec 18, 2024
Support for more precise TBAA metadata has been added a while ago
(behind the -fpointer-tbaa flag). The more precise TBAA metadata allows
treating accesses of different pointer types as no-alias.

This helps to remove more redundant loads and stores in a number of
workloads.

Some highlights on the impact across llvm-test-suite's MultiSource,
SPEC2006 & SPEC2017 include:
 * +2% more NoAlias results for memory accesses
 * +3% more stores removed by DSE,
 * +4% more loops vectorized.

This closes a relatively big gap to GCC, which has been supporting
disambiguating based on pointer types for a long time.
(https://clang.godbolt.org/z/K7Wbhrz4q)

Pointer-TBAA support for pointers to builtin types has been added in
llvm#76612.

Support for user-defined types has been added in
llvm#110569.

There are 2 recent PRs with bug fixes for special cases uncovered during
testing:
 * llvm#116991
 * llvm#116596

PR: llvm#117244
fhahn added a commit to fhahn/llvm-project that referenced this pull request Jan 8, 2025
Be conservative if the type isn't a record type. Handling other types
may
require stripping const-qualifiers inside the type, e.g.
MemberPointerType.

Also look through array types same as through pointer types, to not
pessimize
arrays of pointers.

Without this, we assign different tags to the accesses for p an q in the
second test in cwg158.

PR: llvm#116991
(cherry picked from commit decb874)
fhahn added a commit to fhahn/llvm-project that referenced this pull request Jan 9, 2025
Be conservative if the type isn't a record type. Handling other types
may
require stripping const-qualifiers inside the type, e.g.
MemberPointerType.

Also look through array types same as through pointer types, to not
pessimize
arrays of pointers.

Without this, we assign different tags to the accesses for p an q in the
second test in cwg158.

PR: llvm#116991
(cherry picked from commit decb874)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:codegen clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants