[lld/ELF] Place large executable sections at the beginning #70358
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-elf
Author: Arthur Eubanks (aeubanks)
Changes
So that when mixing small and large text, large text stays out of the way of the rest of the binary.
Full diff: https://github.com/llvm/llvm-project/pull/70358.diff
2 Files Affected:
diff --git a/lld/ELF/Writer.cpp b/lld/ELF/Writer.cpp
index 57e1aa06c6aa873..313ff0f872e9440 100644
--- a/lld/ELF/Writer.cpp
+++ b/lld/ELF/Writer.cpp
@@ -902,11 +902,12 @@ enum RankFlags {
RF_NOT_ALLOC = 1 << 26,
RF_PARTITION = 1 << 18, // Partition number (8 bits)
RF_NOT_SPECIAL = 1 << 17,
- RF_WRITE = 1 << 16,
- RF_EXEC_WRITE = 1 << 15,
- RF_EXEC = 1 << 14,
- RF_RODATA = 1 << 13,
- RF_LARGE = 1 << 12,
+ RF_LARGE_TEXT = 1 << 16,
+ RF_WRITE = 1 << 15,
+ RF_EXEC_WRITE = 1 << 14,
+ RF_EXEC = 1 << 13,
+ RF_RODATA = 1 << 12,
+ RF_LARGE = 1 << 11,
RF_NOT_RELRO = 1 << 9,
RF_NOT_TLS = 1 << 8,
RF_BSS = 1 << 7,
@@ -957,6 +958,7 @@ static unsigned getSectionRank(OutputSection &osec) {
// places.
bool isExec = osec.flags & SHF_EXECINSTR;
bool isWrite = osec.flags & SHF_WRITE;
+ bool isLarge = osec.flags & SHF_X86_64_LARGE && config->emachine == EM_X86_64;
if (!isWrite && !isExec) {
// Make PROGBITS sections (e.g .rodata .eh_frame) closer to .text to
@@ -965,10 +967,13 @@ static unsigned getSectionRank(OutputSection &osec) {
if (osec.type == SHT_PROGBITS)
rank |= RF_RODATA;
// Among PROGBITS sections, place .lrodata further from .text.
- if (!(osec.flags & SHF_X86_64_LARGE && config->emachine == EM_X86_64))
+ if (!isLarge)
rank |= RF_LARGE;
} else if (isExec) {
rank |= isWrite ? RF_EXEC_WRITE : RF_EXEC;
+ // Place .ltext after all other not special sections.
+ if (isLarge)
+ rank |= RF_LARGE_TEXT;
} else {
rank |= RF_WRITE;
// The TLS initialization block needs to be a single contiguous block. Place
@@ -981,7 +986,7 @@ static unsigned getSectionRank(OutputSection &osec) {
rank |= RF_NOT_RELRO;
// Place .ldata and .lbss after .bss. Making .bss closer to .text alleviates
// relocation overflow pressure.
- if (osec.flags & SHF_X86_64_LARGE && config->emachine == EM_X86_64)
+ if (isLarge)
rank |= RF_LARGE;
}
diff --git a/lld/test/ELF/x86-64-section-layout.s b/lld/test/ELF/x86-64-section-layout.s
index 37201279fa0a5d0..5ae24f132f9f7b9 100644
--- a/lld/test/ELF/x86-64-section-layout.s
+++ b/lld/test/ELF/x86-64-section-layout.s
@@ -30,7 +30,8 @@
# CHECK-NEXT: .ldata PROGBITS 0000000000205b07 000b07 000002 00 WAl 0 0 1
# CHECK-NEXT: .ldata2 PROGBITS 0000000000205b09 000b09 000001 00 WAl 0 0 1
# CHECK-NEXT: .lbss NOBITS 0000000000205b0a 000b0a 000002 00 WAl 0 0 1
-# CHECK-NEXT: .comment PROGBITS 0000000000000000 000b0a {{.*}} 01 MS 0 0 1
+# CHECK-NEXT: .ltext PROGBITS 0000000000206b0c 000b0c 000001 00 AXl 0 0 1
+# CHECK-NEXT: .comment PROGBITS 0000000000000000 000b0d {{.*}} 01 MS 0 0 1
# CHECK: Program Headers:
# CHECK-NEXT: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
@@ -44,7 +45,8 @@
# CHECK1: .data PROGBITS 0000000000203306 000306 000001 00 WA 0 0 1
# CHECK1-NEXT: .ldata PROGBITS 0000000000203307 000307 000002 00 WAl 0 0 1
# CHECK1-NEXT: .ldata2 PROGBITS 0000000000203309 000309 000001 00 WAl 0 0 1
-# CHECK1-NEXT: .comment PROGBITS 0000000000000000 00030a {{.*}} 01 MS 0 0 1
+# CHECK1-NEXT: .ltext PROGBITS 000000000020430a 00030a 000001 00 AXl 0 0 1
+# CHECK1-NEXT: .comment PROGBITS 0000000000000000 00030b {{.*}} 01 MS 0 0 1
# CHECK2: .note NOTE 0000000000200300 000300 000001 00 A 0 0 1
# CHECK2-NEXT: .lrodata PROGBITS 0000000000200301 000301 000001 00 Al 0 0 1
@@ -60,7 +62,8 @@
# CHECK2-NEXT: .ldata PROGBITS 0000000000201b07 001b07 000002 00 WAl 0 0 1
# CHECK2-NEXT: .ldata2 PROGBITS 0000000000201b09 001b09 000001 00 WAl 0 0 1
# CHECK2-NEXT: .lbss NOBITS 0000000000201b0a 001b0a 000002 00 WAl 0 0 1
-# CHECK2-NEXT: .comment PROGBITS 0000000000000000 001b0a {{.*}} 01 MS 0 0 1
+# CHECK2-NEXT: .ltext PROGBITS 0000000000201b0c 001b0c 000001 00 AXl 0 0 1
+# CHECK2-NEXT: .comment PROGBITS 0000000000000000 001b0d {{.*}} 01 MS 0 0 1
# CHECK2: Program Headers:
# CHECK2-NEXT: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
@@ -68,6 +71,7 @@
# CHECK2-NEXT: LOAD 0x000000 0x0000000000200000 0x0000000000200000 0x000304 0x000304 R 0x1000
# CHECK2-NEXT: LOAD 0x000304 0x0000000000200304 0x0000000000200304 0x000001 0x000001 R E 0x1000
# CHECK2-NEXT: LOAD 0x000305 0x0000000000200305 0x0000000000200305 0x001805 0x001807 RW 0x1000
+# CHECK2-NEXT: LOAD 0x001b0c 0x0000000000201b0c 0x0000000000201b0c 0x000001 0x000001 R E 0x1000
# CHECK2-NEXT: TLS 0x000305 0x0000000000200305 0x0000000000200305 0x000001 0x000003 R 0x1
#--- a.s
@@ -75,6 +79,7 @@
_start:
ret
+.section .ltext,"axl",@progbits; .space 1
.section .note,"a",@note; .space 1
.section .rodata,"a",@progbits; .space 1
.section .data,"aw",@progbits; .space 1
The x86-64 ABI mentions I believe
see main comment
I'd consider that we place |
So that when mixing small and large text, large text stays out of the way of the rest of the binary.
It's fine if .ltext is far away from .got; it uses a 64-bit relocation to access it.
I agree that partitioned executables would be nice, but in cases where it's prohibitively expensive to change the deployment model from one main executable to multiple files, this is a nice way of getting things to link if we don't care about performance, while also handling precompiled libraries. We can (and do want to) explore more efficient ways to work around the small code model limits, but this is an extremely quick and low-maintenance way to get things linking in the meantime without restructuring everything else.
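To make the 64-bit relocation point concrete, here is a minimal sketch of the range check that a 32-bit PC-relative reference has to pass. The function name `fitsPcRel32` is hypothetical, addends are ignored, and this is not lld's relocation code; it only shows why a reference that needs more than a signed 32-bit displacement cannot use the small code model's usual encoding.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, not part of lld: returns true if the distance from
// `place` (address of the relocated field) to `target` fits in the signed
// 32-bit displacement used by small-code-model PC-relative references.
// Addends are ignored to keep the sketch short.
bool fitsPcRel32(uint64_t target, uint64_t place) {
  int64_t disp = static_cast<int64_t>(target - place);
  return disp >= std::numeric_limits<int32_t>::min() &&
         disp <= std::numeric_limits<int32_t>::max();
}
```

A 64-bit absolute or 64-bit PC-relative reference has no such limit, which is why code in a large section can sit arbitrarily far from the data it accesses.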
I think this looks good. From what I can tell you've addressed @MaskRay's concern, but we should confirm.
@@ -902,11 +902,12 @@ enum RankFlags {
RF_NOT_ALLOC = 1 << 26,
This ranking mechanism is a bit difficult to understand, but changing it is out of scope of this change.
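For readers unfamiliar with the mechanism, here is a rough sketch of how the rank bits turn into a layout, assuming sections are ordered by ascending rank. It uses only the flag values from the diff as shown on this page (the revision where .ltext sorts after the other non-special sections). `Sec`, `rankOf`, and `layoutOrder` are hypothetical stand-ins, and the RELRO, TLS, BSS, and RF_EXEC_WRITE handling is left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Subset of the rank bits, using the post-change values from the diff.
enum RankFlags : uint32_t {
  RF_NOT_SPECIAL = 1u << 17,
  RF_LARGE_TEXT = 1u << 16,
  RF_WRITE = 1u << 15,
  RF_EXEC = 1u << 13,
  RF_RODATA = 1u << 12,
  RF_LARGE = 1u << 11,
};

// Hypothetical stand-in for an output section; not lld's OutputSection.
struct Sec {
  std::string name;
  bool exec;
  bool write;
  bool large;
};

// Mirrors only the three branches touched by the patch. All sections are
// assumed to be SHT_PROGBITS and not special.
uint32_t rankOf(const Sec &s) {
  uint32_t rank = RF_NOT_SPECIAL;
  if (!s.write && !s.exec) {
    rank |= RF_RODATA;
    if (!s.large)
      rank |= RF_LARGE; // non-large rodata sorts later, i.e. nearer .text
  } else if (s.exec) {
    rank |= RF_EXEC;
    if (s.large)
      rank |= RF_LARGE_TEXT; // highest bit here, so .ltext sorts last
  } else {
    rank |= RF_WRITE;
    if (s.large)
      rank |= RF_LARGE; // large data sorts after regular data
  }
  return rank;
}

// Sorts by ascending rank and returns the section names in layout order.
std::vector<std::string> layoutOrder(std::vector<Sec> secs) {
  std::stable_sort(secs.begin(), secs.end(), [](const Sec &a, const Sec &b) {
    return rankOf(a) < rankOf(b);
  });
  std::vector<std::string> names;
  for (const Sec &s : secs)
    names.push_back(s.name);
  return names;
}
```

Because each property owns one bit, a higher bit always dominates the lower ones, so RF_LARGE_TEXT at bit 16 outranks every write/exec/rodata combination below it. Under these assumptions the six sections .ltext, .data, .text, .lrodata, .rodata, .ldata come out as .lrodata, .rodata, .text, .data, .ldata, .ltext, which matches the relative order in the test expectations above.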
Even my suggestion placing |
what do you mean by |
I think our analysis of msan track origins shows that it has a 3x .text size multiplier, and many reasonable applications have ~0.67GiB of .text. Even with Landing this |
Reading between the lines, the main motivation for this is msan builds? Regarding the point of evicting code to dynamic libraries: yeah, it's possible, and we do it, but even so it's not super scalable, since each service has to set up what to evict in its builds. TBH this discussion sounds very similar to https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU for data sections.
I think we can give this a try. Best to mention when LLVM started to emit The code and x86-64-section-layout.s has other changes, so this PR needs a rebase. We probably should add another assembly file in |
So that when mixing small and large text, large text stays out of the way of the rest of the binary.
Place it at the beginning rather than at the end so that with --no-rosegment, the large text and rodata share a single PT_LOAD segment.
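The segment-sharing argument can be illustrated with a simplified model in which consecutive sections with equal segment flags share one PT_LOAD, and --no-rosegment is assumed to treat every non-writable section as executable for grouping purposes. The helper names and the two orderings in the note below are hypothetical; this is a sketch of the reasoning, not lld's program-header code.

```cpp
#include <cstddef>
#include <vector>

// ELF p_flags bit values.
enum Perm : unsigned { PF_X = 1, PF_W = 2, PF_R = 4 };

// Assumed model of --no-rosegment: non-writable sections are grouped as if
// they were executable, so R and RX sections can share a segment.
unsigned segmentFlags(unsigned secPerm, bool noRosegment) {
  if (noRosegment && !(secPerm & PF_W))
    return secPerm | PF_X;
  return secPerm;
}

// Counts PT_LOAD segments for sections given in layout order, starting a new
// segment whenever the effective flags change.
std::size_t countLoadSegments(const std::vector<unsigned> &perms,
                              bool noRosegment) {
  std::size_t count = 0;
  unsigned prev = 0;
  for (unsigned p : perms) {
    unsigned flags = segmentFlags(p, noRosegment);
    if (count == 0 || flags != prev)
      ++count;
    prev = flags;
  }
  return count;
}
```

Take a hypothetical layout of large text, rodata, text, data (RX, R, RX, RW) against rodata, text, data, large text (R, RX, RW, RX). With --no-rosegment, the model gives two segments for the first ordering and three for the second, because large text placed at the end is cut off from the other non-writable sections by the writable data.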