Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lldb][DWARFASTParserClang] Check DW_AT_declaration to determine static data members #68300

Merged
merged 2 commits into from
Oct 6, 2023

Conversation

Michael137
Copy link
Member

@Michael137 Michael137 commented Oct 5, 2023

Background

Prior to DWARFv4, there was no clear normative text on how to handle static data members. Non-normative text suggested that compilers should use DW_AT_external to mark static data members of structrues/unions. Clang does this consistently. However, GCC doesn't, e.g., when the structure/union is in an anonymous namespace (which is C++ standard conformant). Additionally, GCC never emits DW_AT_data_member_locations for union members (regardless of storage linkage and storage duration).

Since DWARFv5 (issue 161118.1), static data members get emitted as DW_TAG_variable.

LLDB used to differentiate between static and non-static members by checking the DW_AT_external flag and the absence of DW_AT_data_member_location. With D18008 LLDB started to pretend that union members always have a 0 DW_AT_data_member_location by default (because GCC never emits these locations).

In D124409 LLDB stopped checking the DW_AT_external flag to account for the case where GCC doesn't emit the flag for types in anonymous namespaces; instead we only check for presence of DW_AT_data_member_locations.

The combination of these changes then meant that LLDB would never correctly detect that a union has static data members.

Solution

Instead of unconditionally initializing the member_byte_offset to 0 specifically for union members, this patch proposes to check for both the absence of DW_AT_data_member_location and DW_AT_declaration, which consistently gets emitted for static data members on GCC and Clang.

We initialize the member_byte_offset to 0 anyway if we determine it wasn't a static. So removing the special case for unions makes this code simpler to reason about.

Long-term, we should just use DWARFv5's new representation for static data members.

Fixes #68135

…ic data members

**Background**

Prior to DWARFv4, there was no clear normative text on how to handle
static data members. Non-normative text suggested we compilers should
use `DW_AT_external` to mark static data members of structrues/unions.
Clang does this consistently. However, GCC doesn't, e.g., when the
structure/union is in an anonymous namespace (which is C++ standard
conformant). Additionally, GCC never emits `DW_AT_data_member_location`s
for union members (regardless of storage linkage and storage duration).

Since DWARFv5 (issue 161118.1), static data members get emitted as
`DW_TAG_variable`.

LLDB used to differentiate between static and non-static members by
checking the `DW_AT_external` flag and the absence of
`DW_AT_data_member_location`. With D18008 LLDB started to pretend
that union members always have a `0` `DW_AT_data_member_location`
by default (because GCC never emits these locations).

In D124409 LLDB stopped checking the `DW_AT_external` flag to account
for the case where GCC doesn't emit the flag for types in anonymous
namespaces; instead we only check for presence of
`DW_AT_data_member_location`s.

The combination of these changes then meant that LLDB would never
correctly detect that a union has static data members.

**Solution**

Instead of unconditionally initializing the `member_byte_offset` to `0`
specifically for union members, this patch proposes to check for both
the absence of `DW_AT_data_member_location` and `DW_AT_declaration`,
which consistently gets emitted for static data members on GCC and
Clang.

We initialize the `member_byte_offset` to `0` anyway if we determine
it wasn't a static. So removing the special case for unions makes this
code simpler to reason about.

Long-term, we should just use DWARFv5's new representation for static
data members.

Fixes llvm#68135
@llvmbot llvmbot added the lldb label Oct 5, 2023
@Michael137
Copy link
Member Author

Alternatively, we could start checking DW_AT_external again, at the cost of not supporting some GCC cases pre-DWARFv5

@llvmbot
Copy link
Member

llvmbot commented Oct 5, 2023

@llvm/pr-subscribers-lldb

Changes

Background

Prior to DWARFv4, there was no clear normative text on how to handle static data members. Non-normative text suggested we compilers should use DW_AT_external to mark static data members of structrues/unions. Clang does this consistently. However, GCC doesn't, e.g., when the structure/union is in an anonymous namespace (which is C++ standard conformant). Additionally, GCC never emits DW_AT_data_member_locations for union members (regardless of storage linkage and storage duration).

Since DWARFv5 (issue 161118.1), static data members get emitted as DW_TAG_variable.

LLDB used to differentiate between static and non-static members by checking the DW_AT_external flag and the absence of DW_AT_data_member_location. With D18008 LLDB started to pretend that union members always have a 0 DW_AT_data_member_location by default (because GCC never emits these locations).

In D124409 LLDB stopped checking the DW_AT_external flag to account for the case where GCC doesn't emit the flag for types in anonymous namespaces; instead we only check for presence of
DW_AT_data_member_locations.

The combination of these changes then meant that LLDB would never correctly detect that a union has static data members.

Solution

Instead of unconditionally initializing the member_byte_offset to 0 specifically for union members, this patch proposes to check for both the absence of DW_AT_data_member_location and DW_AT_declaration, which consistently gets emitted for static data members on GCC and Clang.

We initialize the member_byte_offset to 0 anyway if we determine it wasn't a static. So removing the special case for unions makes this code simpler to reason about.

Long-term, we should just use DWARFv5's new representation for static data members.

Fixes #68135


Full diff: https://github.com/llvm/llvm-project/pull/68300.diff

4 Files Affected:

  • (modified) lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp (+16-6)
  • (added) lldb/test/API/lang/cpp/union-static-data-members/Makefile (+3)
  • (added) lldb/test/API/lang/cpp/union-static-data-members/TestCppUnionStaticMembers.py (+43)
  • (added) lldb/test/API/lang/cpp/union-static-data-members/main.cpp (+25)
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
index 37fb16d4e0351c9..ee35a7de80c1e18 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
@@ -2482,8 +2482,9 @@ struct MemberAttributes {
   DWARFFormValue encoding_form;
   /// Indicates the byte offset of the word from the base address of the
   /// structure.
-  uint32_t member_byte_offset;
+  uint32_t member_byte_offset = UINT32_MAX;
   bool is_artificial = false;
+  bool is_declaration = false;
 };
 
 /// Parsed form of all attributes that are relevant for parsing Objective-C
@@ -2656,8 +2657,6 @@ DiscriminantValue &VariantPart::discriminant() { return this->_discriminant; }
 MemberAttributes::MemberAttributes(const DWARFDIE &die,
                                    const DWARFDIE &parent_die,
                                    ModuleSP module_sp) {
-  member_byte_offset = (parent_die.Tag() == DW_TAG_union_type) ? 0 : UINT32_MAX;
-
   DWARFAttributes attributes = die.GetAttributes();
   for (size_t i = 0; i < attributes.Size(); ++i) {
     const dw_attr_t attr = attributes.AttributeAtIndex(i);
@@ -2717,6 +2716,9 @@ MemberAttributes::MemberAttributes(const DWARFDIE &die,
       case DW_AT_artificial:
         is_artificial = form_value.Boolean();
         break;
+      case DW_AT_declaration:
+        is_declaration = form_value.Boolean();
+        break;
       default:
         break;
       }
@@ -2923,10 +2925,18 @@ void DWARFASTParserClang::ParseSingleMember(
   if (class_is_objc_object_or_interface)
     attrs.accessibility = eAccessNone;
 
-  // Handle static members, which is any member that doesn't have a bit or a
-  // byte member offset.
+  // Handle static members, which are typically members without
+  // locations. However, GCC *never* emits DW_AT_data_member_location
+  // for static data members of unions.
+  // Non-normative text pre-DWARFv5 recommends marking static
+  // data members with an DW_AT_external flag. Clang emits this consistently
+  // whereas GCC emits it only for static data members if not part of an
+  // anonymous namespace. The flag that is consistently emitted for static
+  // data members is DW_AT_declaration, so we check it instead.
+  // FIXME: Since DWARFv5, static data members are marked DW_AT_variable so we can
+  // consistently detect them on both GCC and Clang without below heuristic.
   if (attrs.member_byte_offset == UINT32_MAX &&
-      attrs.data_bit_offset == UINT64_MAX) {
+      attrs.data_bit_offset == UINT64_MAX && attrs.is_declaration) {
     Type *var_type = die.ResolveTypeUID(attrs.encoding_form.Reference());
 
     if (var_type) {
diff --git a/lldb/test/API/lang/cpp/union-static-data-members/Makefile b/lldb/test/API/lang/cpp/union-static-data-members/Makefile
new file mode 100644
index 000000000000000..99998b20bcb0502
--- /dev/null
+++ b/lldb/test/API/lang/cpp/union-static-data-members/Makefile
@@ -0,0 +1,3 @@
+CXX_SOURCES := main.cpp
+
+include Makefile.rules
diff --git a/lldb/test/API/lang/cpp/union-static-data-members/TestCppUnionStaticMembers.py b/lldb/test/API/lang/cpp/union-static-data-members/TestCppUnionStaticMembers.py
new file mode 100644
index 000000000000000..47166636b12647c
--- /dev/null
+++ b/lldb/test/API/lang/cpp/union-static-data-members/TestCppUnionStaticMembers.py
@@ -0,0 +1,43 @@
+"""
+Tests that frame variable and expr work for
+C++ unions and their static data members.
+"""
+import lldb
+from lldbsuite.test.lldbtest import *
+from lldbsuite.test.decorators import *
+import lldbsuite.test.lldbutil as lldbutil
+
+class CppUnionStaticMembersTestCase(TestBase):
+    def test(self):
+        """Tests that frame variable and expr work
+           for union static data members"""
+        self.build()
+
+        (target, process, main_thread, _) = lldbutil.run_to_source_breakpoint(
+            self, "return 0", lldb.SBFileSpec("main.cpp")
+        )                                                                     
+
+        self.expect("frame variable foo", substrs=["val = 42"])
+        self.expect("frame variable bar", substrs=["val = 137"])
+
+        self.expect_expr("foo", result_type="Foo", result_children=[ValueCheck(
+                name="val", value="42"
+            )])
+        self.expect_expr("bar", result_type="Bar", result_children=[ValueCheck(
+                name="val", value="137"
+            )])
+
+        self.expect_expr("Foo::sVal1", result_type="const int", result_value="-42")
+        self.expect_expr("Foo::sVal2", result_type="Foo", result_children=[ValueCheck(
+                name="val", value="42"
+            )])
+
+    @expectedFailureAll
+    def test_union_in_anon_namespace(self):
+        """Tests that frame variable and expr work
+           for union static data members in anonymous
+           namespaces"""
+        self.expect_expr("Bar::sVal1", result_type="const int", result_value="-137")
+        self.expect_expr("Bar::sVal2", result_type="Bar", result_children=[ValueCheck(
+                name="val", value="137"
+            )])
diff --git a/lldb/test/API/lang/cpp/union-static-data-members/main.cpp b/lldb/test/API/lang/cpp/union-static-data-members/main.cpp
new file mode 100644
index 000000000000000..8ba0312cd3a618b
--- /dev/null
+++ b/lldb/test/API/lang/cpp/union-static-data-members/main.cpp
@@ -0,0 +1,25 @@
+union Foo {
+  int val = 42;
+  static const int sVal1 = -42;
+  static Foo sVal2;
+};
+
+Foo Foo::sVal2{};
+
+namespace {
+union Bar {
+  int val = 137;
+  static const int sVal1 = -137;
+  static Bar sVal2;
+};
+
+Bar Bar::sVal2{};
+} // namespace
+
+int main() {
+  Foo foo;
+  Bar bar;
+  auto sum = Bar::sVal1 + Foo::sVal1 + Foo::sVal2.val + Bar::sVal2.val;
+
+  return 0;
+}

@github-actions
Copy link

github-actions bot commented Oct 5, 2023

✅ With the latest revision this PR passed the C/C++ code formatter.

@Michael137 Michael137 requested a review from dwblaikie October 5, 2023 13:01
@Michael137 Michael137 merged commit f74aaca into llvm:main Oct 6, 2023
@Michael137 Michael137 deleted the bugfix/lldb-union-static-members branch October 6, 2023 08:08
Michael137 added a commit to Michael137/llvm-project that referenced this pull request Oct 6, 2023
…ic data members (llvm#68300)

**Background**

Prior to DWARFv4, there was no clear normative text on how to handle
static data members. Non-normative text suggested that compilers should
use `DW_AT_external` to mark static data members of structrues/unions.
Clang does this consistently. However, GCC doesn't, e.g., when the
structure/union is in an anonymous namespace (which is C++ standard
conformant). Additionally, GCC never emits `DW_AT_data_member_location`s
for union members (regardless of storage linkage and storage duration).

Since DWARFv5 (issue 161118.1), static data members get emitted as
`DW_TAG_variable`.

LLDB used to differentiate between static and non-static members by
checking the `DW_AT_external` flag and the absence of
`DW_AT_data_member_location`. With
[D18008](https://reviews.llvm.org/D18008) LLDB started to pretend that
union members always have a `0` `DW_AT_data_member_location` by default
(because GCC never emits these locations).

In [D124409](https://reviews.llvm.org/D124409) LLDB stopped checking the
`DW_AT_external` flag to account for the case where GCC doesn't emit the
flag for types in anonymous namespaces; instead we only check for
presence of `DW_AT_data_member_location`s.

The combination of these changes then meant that LLDB would never
correctly detect that a union has static data members.

**Solution**

Instead of unconditionally initializing the `member_byte_offset` to `0`
specifically for union members, this patch proposes to check for both
the absence of `DW_AT_data_member_location` and `DW_AT_declaration`,
which consistently gets emitted for static data members on GCC and
Clang.

We initialize the `member_byte_offset` to `0` anyway if we determine it
wasn't a static. So removing the special case for unions makes this code
simpler to reason about.

Long-term, we should just use DWARFv5's new representation for static
data members.

Fixes llvm#68135

(cherry picked from commit f74aaca)
Michael137 added a commit to Michael137/llvm-project that referenced this pull request Oct 6, 2023
…ic data members (llvm#68300)

**Background**

Prior to DWARFv4, there was no clear normative text on how to handle
static data members. Non-normative text suggested that compilers should
use `DW_AT_external` to mark static data members of structrues/unions.
Clang does this consistently. However, GCC doesn't, e.g., when the
structure/union is in an anonymous namespace (which is C++ standard
conformant). Additionally, GCC never emits `DW_AT_data_member_location`s
for union members (regardless of storage linkage and storage duration).

Since DWARFv5 (issue 161118.1), static data members get emitted as
`DW_TAG_variable`.

LLDB used to differentiate between static and non-static members by
checking the `DW_AT_external` flag and the absence of
`DW_AT_data_member_location`. With
[D18008](https://reviews.llvm.org/D18008) LLDB started to pretend that
union members always have a `0` `DW_AT_data_member_location` by default
(because GCC never emits these locations).

In [D124409](https://reviews.llvm.org/D124409) LLDB stopped checking the
`DW_AT_external` flag to account for the case where GCC doesn't emit the
flag for types in anonymous namespaces; instead we only check for
presence of `DW_AT_data_member_location`s.

The combination of these changes then meant that LLDB would never
correctly detect that a union has static data members.

**Solution**

Instead of unconditionally initializing the `member_byte_offset` to `0`
specifically for union members, this patch proposes to check for both
the absence of `DW_AT_data_member_location` and `DW_AT_declaration`,
which consistently gets emitted for static data members on GCC and
Clang.

We initialize the `member_byte_offset` to `0` anyway if we determine it
wasn't a static. So removing the special case for unions makes this code
simpler to reason about.

Long-term, we should just use DWARFv5's new representation for static
data members.

Fixes llvm#68135

(cherry picked from commit f74aaca)
@antmox
Copy link
Contributor

antmox commented Oct 6, 2023

Hello! It looks like this broke lldb-aarch64-windows bot:
https://lab.llvm.org/buildbot/#/builders/219/builds/6130
Could you please look at this ?

@Michael137
Copy link
Member Author

Hello! It looks like this broke lldb-aarch64-windows bot: https://lab.llvm.org/buildbot/#/builders/219/builds/6130 Could you please look at this ?

Yup! Was in the process of fixing. I think we just skip this test on windows

@Michael137
Copy link
Member Author

@antmox Addressed in #68408

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[lldb][Expression] Printing union with self-referencing member field crashes lldb
4 participants