-
Notifications
You must be signed in to change notification settings - Fork 12.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb] Improve unwinding for discontinuous functions #111409
Conversation
Currently, our unwinder assumes that the functions are continuous (or at least, that there are no functions which are "in the middle" of other functions). Neither of these assumptions is true for functions optimized by tools like propeller and (probably) bolt. While there are many things that go wrong for these functions, the biggest damage is caused by the unwind plan caching code, which currently takes the maximalist extent of the function and assumes that the unwind plan we get for that is going to be valid for all code inside that range. If a part of the function has been moved into a "cold" section, then the range of the function can be many megabytes, meaning that any function within that range will probably fail to unwind. We end up with this maximalist range because the unwinder asks for the Function object for its range. This is only one of the strategies for determining the range, but it is the first one -- and also the most incorrect one. The second choice would is asking the eh_frame section for the range of the function, and this one returns something reasonable here (the address range of the current function fragment) -- which it does because each fragment gets its own eh_frame entry (it has to, because they have to be continuous). With this in mind, this patch moves the eh_frame (and debug_frame) to the front of the queue. I think that preferring this range makes sense because eh_frame is one of the unwind plans that we return, and some others (augmented eh_frame) are based on it. In theory this could break some functions, where the debug info and eh_frame disagree on the extent of the function (and eh_frame is the one who's wrong), but I don't know of any such scenarios.
@llvm/pr-subscribers-lldb Author: Pavel Labath (labath) ChangesCurrently, our unwinder assumes that the functions are continuous (or at least, that there are no functions which are "in the middle" of other functions). Neither of these assumptions is true for functions optimized by tools like propeller and (probably) bolt. While there are many things that go wrong for these functions, the biggest damage is caused by the unwind plan caching code, which currently takes the maximalist extent of the function and assumes that the unwind plan we get for that is going to be valid for all code inside that range. If a part of the function has been moved into a "cold" section, then the range of the function can be many megabytes, meaning that any function within that range will probably fail to unwind. We end up with this maximalist range because the unwinder asks for the Function object for its range. This is only one of the strategies for determining the range, but it is the first one -- and also the most incorrect one. The second choice would is asking the eh_frame section for the range of the function, and this one returns something reasonable here (the address range of the current function fragment) -- which it does because each fragment gets its own eh_frame entry (it has to, because they have to be continuous). With this in mind, this patch moves the eh_frame (and debug_frame) to the front of the queue. I think that preferring this range makes sense because eh_frame is one of the unwind plans that we return, and some others (augmented eh_frame) are based on it. In theory this could break some functions, where the debug info and eh_frame disagree on the extent of the function (and eh_frame is the one who's wrong), but I don't know of any such scenarios. Patch is 21.61 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111409.diff 6 Files Affected:
diff --git a/lldb/source/Commands/CommandObjectTarget.cpp b/lldb/source/Commands/CommandObjectTarget.cpp
index e950fb346c253b..9a4a17eaa124ce 100644
--- a/lldb/source/Commands/CommandObjectTarget.cpp
+++ b/lldb/source/Commands/CommandObjectTarget.cpp
@@ -3583,10 +3583,12 @@ class CommandObjectTargetModulesShowUnwind : public CommandObjectParsed {
addr_t start_addr = range.GetBaseAddress().GetLoadAddress(target);
if (abi)
start_addr = abi->FixCodeAddress(start_addr);
+ range.GetBaseAddress().SetLoadAddress(start_addr, target);
FuncUnwindersSP func_unwinders_sp(
sc.module_sp->GetUnwindTable()
- .GetUncachedFuncUnwindersContainingAddress(start_addr, sc));
+ .GetUncachedFuncUnwindersContainingAddress(range.GetBaseAddress(),
+ sc));
if (!func_unwinders_sp)
continue;
diff --git a/lldb/source/Symbol/UnwindTable.cpp b/lldb/source/Symbol/UnwindTable.cpp
index da88b0c9c4ea19..42ab7ba95b9e6c 100644
--- a/lldb/source/Symbol/UnwindTable.cpp
+++ b/lldb/source/Symbol/UnwindTable.cpp
@@ -99,12 +99,6 @@ UnwindTable::GetAddressRange(const Address &addr, const SymbolContext &sc) {
m_object_file_unwind_up->GetAddressRange(addr, range))
return range;
- // Check the symbol context
- if (sc.GetAddressRange(eSymbolContextFunction | eSymbolContextSymbol, 0,
- false, range) &&
- range.GetBaseAddress().IsValid())
- return range;
-
// Does the eh_frame unwind info has a function bounds for this addr?
if (m_eh_frame_up && m_eh_frame_up->GetAddressRange(addr, range))
return range;
@@ -113,6 +107,12 @@ UnwindTable::GetAddressRange(const Address &addr, const SymbolContext &sc) {
if (m_debug_frame_up && m_debug_frame_up->GetAddressRange(addr, range))
return range;
+ // Check the symbol context
+ if (sc.GetAddressRange(eSymbolContextFunction | eSymbolContextSymbol, 0,
+ false, range) &&
+ range.GetBaseAddress().IsValid())
+ return range;
+
return std::nullopt;
}
diff --git a/lldb/test/Shell/Unwind/Inputs/basic-block-sections-with-dwarf.s b/lldb/test/Shell/Unwind/Inputs/basic-block-sections-with-dwarf.s
new file mode 100644
index 00000000000000..c405e51c227cb6
--- /dev/null
+++ b/lldb/test/Shell/Unwind/Inputs/basic-block-sections-with-dwarf.s
@@ -0,0 +1,256 @@
+# An example of a function which has been split into two parts. Roughly
+# corresponds to this C code.
+# int baz() { return 47; }
+# int bar() { return foo(0); }
+# int foo(int flag) { return flag ? bar() : baz(); }
+# int main() { return foo(1); }
+# The function bar has been placed "in the middle" of foo.
+
+ .text
+
+ .type baz,@function
+baz:
+ .cfi_startproc
+ movl $47, %eax
+ retq
+ .cfi_endproc
+.Lbaz_end:
+ .size baz, .Lbaz_end-baz
+
+ .type foo,@function
+foo:
+ .cfi_startproc
+ pushq %rbp
+ .cfi_def_cfa_offset 16
+ .cfi_offset %rbp, -16
+ movq %rsp, %rbp
+ .cfi_def_cfa_register %rbp
+ subq $16, %rsp
+ movl %edi, -8(%rbp)
+ cmpl $0, -8(%rbp)
+ je foo.__part.2
+ jmp foo.__part.1
+ .cfi_endproc
+.Lfoo_end:
+ .size foo, .Lfoo_end-foo
+
+foo.__part.1:
+ .cfi_startproc
+ .cfi_def_cfa %rbp, 16
+ .cfi_offset %rbp, -16
+ callq bar
+ movl %eax, -4(%rbp)
+ jmp foo.__part.3
+.Lfoo.__part.1_end:
+ .size foo.__part.1, .Lfoo.__part.1_end-foo.__part.1
+ .cfi_endproc
+
+bar:
+ .cfi_startproc
+# NB: Decrease the stack pointer to make the unwind info for this function
+# different from the surrounding foo function.
+ subq $24, %rsp
+ .cfi_def_cfa_offset 32
+ xorl %edi, %edi
+ callq foo
+ addq $24, %rsp
+ .cfi_def_cfa %rsp, 8
+ retq
+ .cfi_endproc
+.Lbar_end:
+ .size bar, .Lbar_end-bar
+
+foo.__part.2:
+ .cfi_startproc
+ .cfi_def_cfa %rbp, 16
+ .cfi_offset %rbp, -16
+ callq baz
+ movl %eax, -4(%rbp)
+ jmp foo.__part.3
+.Lfoo.__part.2_end:
+ .size foo.__part.2, .Lfoo.__part.2_end-foo.__part.2
+ .cfi_endproc
+
+foo.__part.3:
+ .cfi_startproc
+ .cfi_def_cfa %rbp, 16
+ .cfi_offset %rbp, -16
+ movl -4(%rbp), %eax
+ addq $16, %rsp
+ popq %rbp
+ .cfi_def_cfa %rsp, 8
+ retq
+.Lfoo.__part.3_end:
+ .size foo.__part.3, .Lfoo.__part.3_end-foo.__part.3
+ .cfi_endproc
+
+
+ .globl main
+ .type main,@function
+main:
+ .cfi_startproc
+ movl $1, %edi
+ callq foo
+ retq
+ .cfi_endproc
+.Lmain_end:
+ .size main, .Lmain_end-main
+
+ .section .debug_abbrev,"",@progbits
+ .byte 1 # Abbreviation Code
+ .byte 17 # DW_TAG_compile_unit
+ .byte 1 # DW_CHILDREN_yes
+ .byte 37 # DW_AT_producer
+ .byte 8 # DW_FORM_string
+ .byte 19 # DW_AT_language
+ .byte 5 # DW_FORM_data2
+ .byte 17 # DW_AT_low_pc
+ .byte 1 # DW_FORM_addr
+ .byte 85 # DW_AT_ranges
+ .byte 35 # DW_FORM_rnglistx
+ .byte 116 # DW_AT_rnglists_base
+ .byte 23 # DW_FORM_sec_offset
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 2 # Abbreviation Code
+ .byte 46 # DW_TAG_subprogram
+ .byte 0 # DW_CHILDREN_no
+ .byte 17 # DW_AT_low_pc
+ .byte 1 # DW_FORM_addr
+ .byte 18 # DW_AT_high_pc
+ .byte 1 # DW_FORM_addr
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 3 # Abbreviation Code
+ .byte 46 # DW_TAG_subprogram
+ .byte 1 # DW_CHILDREN_yes
+ .byte 85 # DW_AT_ranges
+ .byte 35 # DW_FORM_rnglistx
+ .byte 64 # DW_AT_frame_base
+ .byte 24 # DW_FORM_exprloc
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 4 # Abbreviation Code
+ .byte 5 # DW_TAG_formal_parameter
+ .byte 0 # DW_CHILDREN_no
+ .byte 2 # DW_AT_location
+ .byte 24 # DW_FORM_exprloc
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 73 # DW_AT_type
+ .byte 19 # DW_FORM_ref4
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 5 # Abbreviation Code
+ .byte 36 # DW_TAG_base_type
+ .byte 0 # DW_CHILDREN_no
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 62 # DW_AT_encoding
+ .byte 11 # DW_FORM_data1
+ .byte 11 # DW_AT_byte_size
+ .byte 11 # DW_FORM_data1
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 0 # EOM(3)
+
+ .section .debug_info,"",@progbits
+.Lcu_begin0:
+ .long .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
+.Ldebug_info_start0:
+ .short 5 # DWARF version number
+ .byte 1 # DWARF Unit Type
+ .byte 8 # Address Size (in bytes)
+ .long .debug_abbrev # Offset Into Abbrev. Section
+ .byte 1 # Abbrev [1] DW_TAG_compile_unit
+ .asciz "Hand-written DWARF" # DW_AT_producer
+ .short 29 # DW_AT_language
+ .quad 0 # DW_AT_low_pc
+ .byte 1 # DW_AT_ranges
+ .long .Lrnglists_table_base0 # DW_AT_rnglists_base
+ .byte 2 # Abbrev [2] DW_TAG_subprogram
+ .quad baz # DW_AT_low_pc
+ .quad .Lbaz_end # DW_AT_high_pc
+ .asciz "baz" # DW_AT_name
+ .byte 2 # Abbrev [2] DW_TAG_subprogram
+ .quad bar # DW_AT_low_pc
+ .quad .Lbar_end # DW_AT_high_pc
+ .asciz "bar" # DW_AT_name
+ .byte 3 # Abbrev [3] DW_TAG_subprogram
+ .byte 0 # DW_AT_ranges
+ .byte 1 # DW_AT_frame_base
+ .byte 86
+ .asciz "foo" # DW_AT_name
+ .byte 4 # Abbrev [4] DW_TAG_formal_parameter
+ .byte 2 # DW_AT_location
+ .byte 145
+ .byte 120
+ .asciz "flag" # DW_AT_name
+ .long .Lint-.Lcu_begin0 # DW_AT_type
+ .byte 0 # End Of Children Mark
+ .byte 2 # Abbrev [2] DW_TAG_subprogram
+ .quad main # DW_AT_low_pc
+ .quad .Lmain_end # DW_AT_high_pc
+ .asciz "main" # DW_AT_name
+.Lint:
+ .byte 5 # Abbrev [5] DW_TAG_base_type
+ .asciz "int" # DW_AT_name
+ .byte 5 # DW_AT_encoding
+ .byte 4 # DW_AT_byte_size
+ .byte 0 # End Of Children Mark
+.Ldebug_info_end0:
+
+ .section .debug_rnglists,"",@progbits
+ .long .Ldebug_list_header_end0-.Ldebug_list_header_start0 # Length
+.Ldebug_list_header_start0:
+ .short 5 # Version
+ .byte 8 # Address size
+ .byte 0 # Segment selector size
+ .long 2 # Offset entry count
+.Lrnglists_table_base0:
+ .long .Ldebug_ranges0-.Lrnglists_table_base0
+ .long .Ldebug_ranges1-.Lrnglists_table_base0
+.Ldebug_ranges0:
+ .byte 6 # DW_RLE_start_end
+ .quad foo
+ .quad .Lfoo_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.1
+ .quad .Lfoo.__part.1_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.2
+ .quad .Lfoo.__part.2_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.3
+ .quad .Lfoo.__part.3_end
+ .byte 0 # DW_RLE_end_of_list
+.Ldebug_ranges1:
+ .byte 6 # DW_RLE_start_end
+ .quad baz
+ .quad .Lbaz_end
+ .byte 6 # DW_RLE_start_end
+ .quad bar
+ .quad .Lbar_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.1
+ .quad .Lfoo.__part.1_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.2
+ .quad .Lfoo.__part.2_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo.__part.3
+ .quad .Lfoo.__part.3_end
+ .byte 6 # DW_RLE_start_end
+ .quad foo
+ .quad .Lfoo_end
+ .byte 6 # DW_RLE_start_end
+ .quad main
+ .quad .Lmain_end
+ .byte 0 # DW_RLE_end_of_list
+.Ldebug_list_header_end0:
+
+ .section ".note.GNU-stack","",@progbits
diff --git a/lldb/test/Shell/Unwind/Inputs/linux-x86_64.yaml b/lldb/test/Shell/Unwind/Inputs/linux-x86_64.yaml
new file mode 100644
index 00000000000000..987462c2a0efc6
--- /dev/null
+++ b/lldb/test/Shell/Unwind/Inputs/linux-x86_64.yaml
@@ -0,0 +1,26 @@
+--- !minidump
+Streams:
+ - Type: ThreadList
+ Threads:
+ - Thread Id: 0x000074DD
+ Context: 0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000B0010000000000033000000000000000000000002020100000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000040109600000000000100000000000000000000000000000068E7D0C8FF7F000068E7D0C8FF7F000097E6D0C8FF7F000010109600000000000000000000000000020000000000000088E4D0C8FF7F0000603FFF85C77F0000F00340000000000080E7D0C8FF7F000000000000000000000000000000000000E0034000000000007F0300000000000000000000000000000000000000000000801F0000FFFF00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF252525252525252525252525252525250000000000000000000000000000000000000000000000000000000000000000FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000FF00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
+ Stack:
+ Start of Memory Range: 0x00007FFFC8D0E000
+ Content: DEADBEEFBAADF00D
+ - Type: Exception
+ Thread ID: 0x000074DD
+ Exception Record:
+ Exception Code: 0x0000000B
+ Thread Context: 00000000
+ - Type: SystemInfo
+ Processor Arch: AMD64
+ Processor Level: 6
+ Processor Revision: 15876
+ Number of Processors: 40
+ Platform ID: Linux
+ CSD Version: 'Linux 3.13.0-91-generic'
+ CPU:
+ Vendor ID: GenuineIntel
+ Version Info: 0x00000000
+ Feature Info: 0x00000000
+...
diff --git a/lldb/test/Shell/Unwind/basic-block-sections-with-dwarf-static.test b/lldb/test/Shell/Unwind/basic-block-sections-with-dwarf-static.test
new file mode 100644
index 00000000000000..09ea0203e3eeae
--- /dev/null
+++ b/lldb/test/Shell/Unwind/basic-block-sections-with-dwarf-static.test
@@ -0,0 +1,65 @@
+# Test unwind info for functions which have been split into two or more parts.
+# In particular, check that the address ranges of these plans are correct, as
+# overly large ranges have caused a bug in the past.
+
+# REQUIRES: lld, target-x86_64
+
+# RUN: llvm-mc -triple=x86_64-pc-linux -filetype=obj \
+# RUN: %S/Inputs/basic-block-sections-with-dwarf.s > %t.o
+# RUN: ld.lld %t.o -o %t
+## NB: This minidump exists only as a receptacle for the object file built
+## above. This is a workaround for the fact that "image show-unwind" does not
+## work without a Process object.
+# RUN: yaml2obj %S/Inputs/linux-x86_64.yaml > %t.core
+# RUN: %lldb -c %t.core %t -o "image load --file %t --slide 0" -s %s -o exit | \
+# RUN: FileCheck --implicit-check-not="UNWIND PLANS" %s
+
+image show-unwind -n foo
+# CHECK: UNWIND PLANS for {{.*}}`foo
+#
+# CHECK: Asynchronous (not restricted to call-sites) UnwindPlan is 'eh_frame CFI'
+# CHECK-NEXT: Synchronous (restricted to call-sites) UnwindPlan is 'eh_frame CFI'
+
+# CHECK: Assembly language inspection UnwindPlan:
+# CHECK-NEXT: This UnwindPlan originally sourced from assembly insn profiling
+# CHECK-NEXT: This UnwindPlan is sourced from the compiler: no.
+# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: yes.
+# CHECK-NEXT: This UnwindPlan is for a trap handler function: no.
+# CHECK-NEXT: Address range of this UnwindPlan: [{{.*}}.text + 6-0x0000000000000019)
+
+# CHECK: eh_frame UnwindPlan:
+# CHECK-NEXT: This UnwindPlan originally sourced from eh_frame CFI
+# CHECK-NEXT: This UnwindPlan is sourced from the compiler: yes.
+# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: no.
+# CHECK-NEXT: This UnwindPlan is for a trap handler function: no.
+# CHECK-NEXT: Address range of this UnwindPlan: [{{.*}}.text + 6-0x0000000000000019)
+# CHECK-NEXT: row[0]: 0: CFA=rsp +8 => rip=[CFA-8]
+# CHECK-NEXT: row[1]: 1: CFA=rsp+16 => rbp=[CFA-16] rip=[CFA-8]
+# CHECK-NEXT: row[2]: 4: CFA=rbp+16 => rbp=[CFA-16] rip=[CFA-8]
+# CHECK-EMPTY:
+
+image show-unwind -n foo.__part.1
+# CHECK: UNWIND PLANS for {{.*}}`foo.__part.1
+
+## As of this writing (Oct 2024), the async unwind plan is "assembly insn
+## profiling", which isn't ideal, because this "function" does not follow the
+## standard ABI. We end up choosing this plan because the eh_frame unwind plan
+## looks like the unwind plan for a regular function without the prologue
+## information.
+# CHECK: Synchronous (restricted to call-sites) UnwindPlan is 'eh_frame CFI'
+
+# CHECK: Assembly language inspection UnwindPlan:
+# CHECK-NEXT: This UnwindPlan originally sourced from assembly insn profiling
+# CHECK-NEXT: This UnwindPlan is sourced from the compiler: no.
+# CHECK-NEXT: This UnwindPlan is valid at all instruction locations: yes.
+# CHECK-NEXT: This UnwindPlan is for a trap handler function: no.
+# CHECK-NEXT: ...
[truncated]
|
|
||
FuncUnwindersSP func_unwinders_sp( | ||
sc.module_sp->GetUnwindTable() | ||
.GetUncachedFuncUnwindersContainingAddress(start_addr, sc)); | ||
.GetUncachedFuncUnwindersContainingAddress(range.GetBaseAddress(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because image show-unwind
wouldn't use the eh_frame address range without it (it requires a resolved address).
BTW. This would have been easier to investigate (and test) if this command used the actual cached unwind plans instead of creating them from scratch. This way I was left puzzled as to unwind went off track even though the unwind plans looked (mostly) reasonable. What would you say to:
a) adding an option to use/print the cached plans
b) making this option the default
c) deleting the non-cached option entirely?
This is interesting, sorry the uncached unwindplans made it harder to debug. I've seen cold function outlining in our libraries but I've never looked at the generated DWARF - does the function die have a location list with the main function body and the .cold.1 function body as entries, discontiguous? I'll try to scrub around and find one of ours and see what our DWARF looks like tomorrow on a Darwin binary that does this, curious now. From a symbol table point of view, if the symbol names haven't been stripped, the function symbol will have the range of the main function body only; the .cold.1 function would be a separate Symbol. In a stripped binary (no symbol name), I know ObjectFileMachO will use the eh_frame start addresses to create fake symbol table names & entries, I don't know if ObjectFileELF does that, but in that case a Symbol would be equivalent to eh_frame as a source of information. As for why the image show-unwind is using the uncached version, it's a little hack to aid in debugging. You can turn on the verbose unwind log and do a show-unwind to see all the logging from UnwindAssemblyInstEmulation as it goes over the instructions. With the cached version, it wouldn't re-do this work so you'd miss it unless you turned on the verbose log at the beginning of the process execution (and skipped a lot of other functions that you're not interested in). You can adjust the logging level and re-do the command and it re-calculates it again, it's a bit handy like this. I'd like to take a peek at the DWARF for a binary we have using this codegen out of curiosity, but the patch looks fine to me. It's clearly not great that this is this fragile, and I'd like to reconsider how these ranges are calculated so it's clearer what is intended. I wonder if the Dwarin debug info reflects these as separate functions, or maybe I've just gotten lucky that I never looked too closely at one. |
Let me just take a peek at some Darwin binaries with this codegen & their debuginfo, and then i'll approve unless there's something I've misunderstood about what we're looking at. It really makes me think that we are going to need discontiguous ranges in the unwind plans to properly handle this kind of program. |
Thanks for the quick response, Jason. I think it's quite possible that you haven't run into this situation before, because the final representation depends on what tool was used to split the functions. Judging by the name ( The thing that produces this output is the The problem begins in I've considered (and that's something I'd like to do independently of this patch) changing the
ObjectFileELF does that too. The problems begin only when debug info is present because lldb_private::Function will claim the maximal range, as I've described above. However, this creation of synthetic symbols from unwind info is perhaps a good reason for why changing the order of range sources is safe(ish). (Although it could be a reason for not querying the eh_frame for range information at all) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the background here, I can see what a problem a function with discontiguous ranges can be for some of our data structures. Looking at the assembly in your test case (which is going to be really helpful if we try to tackle this more accurately in the future), I see that the instruction scanner is also going to fail for the cold parts of the functions, which are tail-called (JMP'ed) to, if it tried to analyze the prologue setup based on the instructions in them. The eh_frame treats them as separate functions and hard-codes the correct stack setup from the base function in the cold blocks of code, that's one way to cheat this kind of thing, treating them as separate functions that have only contiguous address ranges.
This looks good to me for now, we'll probably need to handle this more thoroughly in the future not only in the unwind plans. :/
Yeah, I plan to look into Function::GetAddressRange and see how we can make it do something reasonable for these kinds of functions (probably by having it return a list of ranges). That will likely involve looking at the users of that API as well, although I'm less than enthusiastic about making this work with the instruction scanner -- the reason for that (at least for our use case) I don't see much use for it. We already have tools (like profilers) which rely on the correctness of unwind info (even in the async case), so I don't think we gain much by using it. Clang already produces correct eh_frame for these discontinuous functions (for x86 at least, we're in the process of fixing aarch64), and hand-written functions are not likely to be split, and even if they are, unless the user writes correct debug info for them, we won't be able to detect that they are split. At that point the user might as well write the eh_frame section as well... |
Hi this patch broke Unwind/trap_frame_sym_ctx.test in greendragon, you can see the link to the failing build here: I will be reverting this change to make sure green dragon is up and running again |
This reverts commit a89e016. This is being reverted because it broke the test: Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input CHECK: frame #2: {{.*}}`main
I'm sorry for breaking the bot. I'm afraid I'm going to need some help to understand to problem though. This test is x86 specific, and I don't have access to an x86 mac machine (only arm), and the test works fine on linux. Could you (or whoever is responsible for that bot) provide some details on the failure? The undwind log ( |
Alex had a PR a month ago #106791 that got reverted from the same test failing on x86, I tried to repo the failure a few times on an intel mac and never succeeded. I'll try again with your patch when I have that set up. If this test didn't start/stop failing so clearly with these commits, I'd be very dubious of the test itself, to be honest. |
Thanks Jason. I have no doubt that the failure was triggered by this patch (the test tests unwinding, the patch touches unwinding), but the mechanism itself is very unclear to me. The functions in that patch have a valid eh_frame (and they don't have debug info), so I would expect that a change like this would be a noop in this case. I suspect there's something subtle going on, and that this could very well be a bug in some other lldb component, but it's hard tell what that is without more info. It could also easily depend on some system code that makes its way into the debugged process (process startup code and stuff). If you're unable to reproduce this on your machine, maybe we'll have to modify the test to dump loads of extra output to make sense of it. |
Currently, our unwinder assumes that the functions are continuous (or at least, that there are no functions which are "in the middle" of other functions). Neither of these assumptions is true for functions optimized by tools like propeller and (probably) bolt. While there are many things that go wrong for these functions, the biggest damage is caused by the unwind plan caching code, which currently takes the maximalist extent of the function and assumes that the unwind plan we get for that is going to be valid for all code inside that range. If a part of the function has been moved into a "cold" section, then the range of the function can be many megabytes, meaning that any function within that range will probably fail to unwind. We end up with this maximalist range because the unwinder asks for the Function object for its range. This is only one of the strategies for determining the range, but it is the first one -- and also the most incorrect one. The second choice would is asking the eh_frame section for the range of the function, and this one returns something reasonable here (the address range of the current function fragment) -- which it does because each fragment gets its own eh_frame entry (it has to, because they have to be continuous). With this in mind, this patch moves the eh_frame (and debug_frame) to the front of the queue. I think that preferring this range makes sense because eh_frame is one of the unwind plans that we return, and some others (augmented eh_frame) are based on it. In theory this could break some functions, where the debug info and eh_frame disagree on the extent of the function (and eh_frame is the one who's wrong), but I don't know of any such scenarios.
…1409)" This reverts commit a89e016. This is being reverted because it broke the test: Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input CHECK: frame llvm#2: {{.*}}`main
I was able to build github main on an Intel mac running the current OS, and the shell tests all pass with it. I had the same difficulty trying to repo a failure with this test with Alex's change, I could never get the failure the CI bots were hitting, but I didn't have the exact same macOS version as the CI bots installed, shrug maybe that's important. I'll try to figure that out and install it on an intel mac and try to repo that way, I really wanted to get back to Alex's change too. |
(wrote that update too quickly while running out the door -- I can run the lldb-shell tests both with and without your patch, and they pass. I'm not able to repo the failure the CI bots saw.) |
Currently, our unwinder assumes that the functions are continuous (or at least, that there are no functions which are "in the middle" of other functions). Neither of these assumptions is true for functions optimized by tools like propeller and (probably) bolt. While there are many things that go wrong for these functions, the biggest damage is caused by the unwind plan caching code, which currently takes the maximalist extent of the function and assumes that the unwind plan we get for that is going to be valid for all code inside that range. If a part of the function has been moved into a "cold" section, then the range of the function can be many megabytes, meaning that any function within that range will probably fail to unwind. We end up with this maximalist range because the unwinder asks for the Function object for its range. This is only one of the strategies for determining the range, but it is the first one -- and also the most incorrect one. The second choice would is asking the eh_frame section for the range of the function, and this one returns something reasonable here (the address range of the current function fragment) -- which it does because each fragment gets its own eh_frame entry (it has to, because they have to be continuous). With this in mind, this patch moves the eh_frame (and debug_frame) to the front of the queue. I think that preferring this range makes sense because eh_frame is one of the unwind plans that we return, and some others (augmented eh_frame) are based on it. In theory this could break some functions, where the debug info and eh_frame disagree on the extent of the function (and eh_frame is the one who's wrong), but I don't know of any such scenarios.
…1409)" This reverts commit a89e016. This is being reverted because it broke the test: Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input CHECK: frame llvm#2: {{.*}}`main
FTR I have an intel mac running the same OS as the CI bots ( |
Thank you for looking into this. |
Currently, our unwinder assumes that the functions are continuous (or at least, that there are no functions which are "in the middle" of other functions). Neither of these assumptions is true for functions optimized by tools like propeller and (probably) bolt. While there are many things that go wrong for these functions, the biggest damage is caused by the unwind plan caching code, which currently takes the maximalist extent of the function and assumes that the unwind plan we get for that is going to be valid for all code inside that range. If a part of the function has been moved into a "cold" section, then the range of the function can be many megabytes, meaning that any function within that range will probably fail to unwind. We end up with this maximalist range because the unwinder asks for the Function object for its range. This is only one of the strategies for determining the range, but it is the first one -- and also the most incorrect one. The second choice would is asking the eh_frame section for the range of the function, and this one returns something reasonable here (the address range of the current function fragment) -- which it does because each fragment gets its own eh_frame entry (it has to, because they have to be continuous). With this in mind, this patch moves the eh_frame (and debug_frame) to the front of the queue. I think that preferring this range makes sense because eh_frame is one of the unwind plans that we return, and some others (augmented eh_frame) are based on it. In theory this could break some functions, where the debug info and eh_frame disagree on the extent of the function (and eh_frame is the one who's wrong), but I don't know of any such scenarios.
…1409)" This reverts commit a89e016. This is being reverted because it broke the test: Unwind/trap_frame_sym_ctx.test /Users/ec2-user/jenkins/workspace/llvm.org/lldb-cmake/llvm-project/lldb/test/Shell/Unwind/trap_frame_sym_ctx.test:21:10: error: CHECK: expected string not found in input CHECK: frame #2: {{.*}}`main
Currently, our unwinder assumes that the functions are continuous (or at least, that there are no functions which are "in the middle" of other functions). Neither of these assumptions is true for functions optimized by tools like propeller and (probably) bolt.
While there are many things that go wrong for these functions, the biggest damage is caused by the unwind plan caching code, which currently takes the maximalist extent of the function and assumes that the unwind plan we get for that is going to be valid for all code inside that range. If a part of the function has been moved into a "cold" section, then the range of the function can be many megabytes, meaning that any function within that range will probably fail to unwind.
We end up with this maximalist range because the unwinder asks for the Function object for its range. This is only one of the strategies for determining the range, but it is the first one -- and also the most incorrect one. The second choice would is asking the eh_frame section for the range of the function, and this one returns something reasonable here (the address range of the current function fragment) -- which it does because each fragment gets its own eh_frame entry (it has to, because they have to be continuous).
With this in mind, this patch moves the eh_frame (and debug_frame) to the front of the queue. I think that preferring this range makes sense because eh_frame is one of the unwind plans that we return, and some others (augmented eh_frame) are based on it. In theory this could break some functions, where the debug info and eh_frame disagree on the extent of the function (and eh_frame is the one who's wrong), but I don't know of any such scenarios.