[BOLT] About optionally emitting the .bolt.org.text segment #85796
Comments
Could someone provide information on how .bolt.org.text is generated? Perhaps I could develop a codesize optimization option for Bolt (similar to clang -Os) to support current and future codesize optimizations for everyone. |
Hi. .bolt.org.text is the original text section, which BOLT leaves untouched and which currently cannot be stripped. There is an old patch on Phabricator that eliminates it, but for now it is considered too complicated and not stable enough to be in BOLT's trunk. |
@yota9 |
@killerloura Yes, usually both texts are preserved. As for size optimisation there are 2 possibilities:
Removing the old text is "extra work" for BOLT which is not implemented. Plus it is considered "less secure", as we might unintentionally miss some pointers, jump tables, etc. and jump to an old text address at runtime (although I've seen such behaviour maybe only once and fixed the bug, but still). |
@yota9 |
Hi, do you mean this section is unsafe to strip, even manually? Could the issue with stripping it be related to #56738? Ack on the concern it's less safe, but what if we accept that risk? I'm running into this issue causing the BOLTed binary to not fit in the fixed memory space allocated for the executable, and unfortunately lite mode isn't implemented for aarch64, so I'm a bit stuck. |
The problem with strip is the new location of PHDR, AFAIR. There is a workaround option, -use-gnu-stack, but I've never tried to combine it with stripping the main text; you may try it. As for lite mode and aarch64: I'm not a user of lite mode, but I'm surprised to hear that. What problems did you encounter with it? As for an "official" method, I believe there is currently no way to drop the old text section, apart from the methods I've already mentioned, which could be tried. |
@yota9 Hi, if I enable the -use-old-text option, it aligns the code at a 2MB boundary. Although my optimizations significantly reduce the code size, the process generates many new functions, so the computed size of the new code is always greater than the old text size. My question is whether reducing the alignment (even down to 4 bytes) would lead to instability issues? |
@YuanSha0 Hi! I would not expect instability; everything should work just fine. By default BOLT uses a large alignment for huge pages support, but I think for -use-old-text we should disable this by default. |
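To make the alignment cost discussed above concrete, here is a minimal arithmetic sketch. It only shows how much padding rounding a start address up to a 2MB boundary can add compared with a 4-byte boundary. The start address is a made-up value, and this is not a description of how BOLT actually lays out sections.

```python
def align_up(addr, alignment):
    """Round addr up to the next multiple of alignment (must be a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def padding(addr, alignment):
    """Number of filler bytes needed to reach the next aligned address."""
    return align_up(addr, alignment) - addr


HUGE_PAGE = 2 * 1024 * 1024  # 2MB, the huge-page alignment mentioned in the thread
start = 0x401234             # hypothetical address where new code would begin

pad_2mb = padding(start, HUGE_PAGE)
pad_4b = padding(start, 4)
print(f"padding at 2MB alignment: {pad_2mb} bytes")
print(f"padding at 4-byte alignment: {pad_4b} bytes")
```

With a small code budget, padding of this magnitude alone can decide whether the rewritten code fits into the space of the old text.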
@yota9 lite mode with aarch64:
I can submit a separate issue if you think it'd be valuable. |
@sdt16 What version of BOLT are you using? Could you please tell me what function is located at /home/bolt/llvm-project/bolt/include/bolt/Core/MCPlusBuilder.h:1609 ? |
@yota9 |
@YuanSha0 I suspect the reason is the same as with .org.text: BOLT just doesn't support removing old sections currently, so you would need to do the same as you already did with .org.text. I might be wrong though; if so, the Meta folks are welcome to answer this question. I'm not deeply into DWARF topics, to be honest :) |
@yota9 I processed the old .eh_frame section in the same way as handling the old .text section with the -use-old-text option. Luckily, the program appears to be functioning properly. |
@yota9 I have completed a substantial portion of my code-size optimization work, but I have noticed that BOLT seems to offer particularly limited support for jump tables. In the analyzeMemoryAt function, I have found that BOLT does not currently support analysis of jump tables within the .text section, nor does it support jump tables on the aarch64 architecture. Is this a limitation of BOLT's current functionality, or does such support exist outside the main branch? |
@YuanSha0 AFAICT the aarch64 jump table support is currently very limited. I tried to study this question a few years ago, and AFAIR the main problem is that we can't tell the jump table size, since there are no markers for it in the binary. |
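As an illustration of why the missing size marker matters, the sketch below applies one generic bounds heuristic: keep reading entries until one decodes to an address outside the owning function. This is not BOLT's implementation. The entry format (signed, table-relative, little-endian), all addresses, and the function name are assumptions chosen for the example.

```python
import struct


def guess_jump_table_targets(data, table_addr, func_start, func_end, entry_size=4):
    """Decode signed table-relative entries until a target leaves [func_start, func_end)."""
    fmt = {1: "<b", 2: "<h", 4: "<i"}[entry_size]
    targets = []
    for off in range(0, len(data) - entry_size + 1, entry_size):
        (rel,) = struct.unpack_from(fmt, data, off)
        target = table_addr + rel
        if not (func_start <= target < func_end):
            break  # first entry that cannot be a branch target in this function
        targets.append(target)
    return targets


# Hypothetical layout: function at [0x1000, 0x1100), table at 0x2000.
FUNC_START, FUNC_END, TABLE = 0x1000, 0x1100, 0x2000
real_entries = [0x1010 - TABLE, 0x1020 - TABLE, 0x1040 - TABLE]

# Unrelated data after the table that clearly points elsewhere: the guess is right.
clean = struct.pack("<4i", *real_entries, 0x7FFFFFFF)
clean_targets = guess_jump_table_targets(clean, TABLE, FUNC_START, FUNC_END)

# Unrelated data that happens to decode to an in-function address: over-counted.
unlucky = struct.pack("<4i", *real_entries, 0x1080 - TABLE)
unlucky_targets = guess_jump_table_targets(unlucky, TABLE, FUNC_START, FUNC_END)

print(len(clean_targets), len(unlucky_targets))
```

The second case shows the ambiguity: without an explicit size, bytes that merely look like a valid entry are indistinguishable from real ones.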
@yota9 Even if I cannot perform a comprehensive analysis of the jump table, is it possible to obtain partial information about it? Specifically, could we identify which function possesses the jump table and label all instructions belonging to this particular jump table? Disabling the handling of jump table instructions during optimization could be a temporarily viable solution. |
@YuanSha0 Theoretically yes, although I'm not sure how well it is implemented now. You can look at BinaryFunction methods such as hasJumpTables, and iterate over the instructions with getJumpTable, since they're marked with annotations. Hope that helps; I haven't looked at this topic for a long time :) |
@yota9 Thank you again! |
@yota9 Hello, I have a question. I’ve noticed that Bolt now supports Linux kernel optimization. After optimizing the Linux kernel, will it still retain the original text segment? Thank you! |
@yota9 Thank you. |
Hello, I have a question. I’ve noticed that Bolt now supports Linux kernel optimization. After optimizing the Linux kernel, will it still retain the original text segment? Thank you! @maksfb |
Yes, as of right now, the original |
@YuanSha0 are you working on a publicly available branch? Would love to collaborate on code-size optimization. |
Apologies for the delayed response. I've been caught up with some other work. Currently, the project I'm working on is in a private phase and not publicly available, so I'm unable to collaborate on it at this time. However, I appreciate your interest and will keep you in mind for potential future collaborations once the project is at a stage where it can be shared. Thanks for understanding! |
I have completed some work on code size optimization, but I noticed that BOLT emits a segment named .bolt.org.text, which accounts for nearly half of the code size in the executable. This renders my optimizations ineffective since the optimized executable is even larger than the unoptimized binary. In this issue, I found that it is not straightforward to remove .bolt.org.text using bolt flags. I would like to know what optimizations are associated with it? Can I sacrifice some (or all) performance optimizations to remove this segment? Will this help in achieving actual code size reduction in BOLT's optimization work? Thank you.
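One way to quantify the overhead described above is to compare section sizes in the rewritten binary. The sketch below parses text in the layout printed by `size -A` (section name, size, address) and reports what fraction of the combined text the preserved section occupies. The sample output and all numbers in it are made up for illustration and are not measurements from a real BOLTed binary.

```python
def section_sizes(size_a_output):
    """Parse `size -A`-style text into {section_name: size_in_bytes}."""
    sizes = {}
    for line in size_a_output.splitlines():
        fields = line.split()
        # Section rows look like "<name> <size> <addr>"; skip headers and totals.
        if len(fields) == 3 and fields[0].startswith(".") and fields[1].isdigit():
            sizes[fields[0]] = int(fields[1])
    return sizes


def share(sizes, name, of=(".text", ".bolt.org.text")):
    """Fraction of the combined size of the sections in `of` taken by `name`."""
    total = sum(sizes.get(s, 0) for s in of)
    return sizes.get(name, 0) / total if total else 0.0


# Made-up sample in the `size -A` layout; values are illustrative only.
SAMPLE = """\
a.out  :
section            size      addr
.bolt.org.text     4000   4198400
.text              6000   6291456
.eh_frame          1200   8388608
Total             11200
"""

sizes = section_sizes(SAMPLE)
org_share = share(sizes, ".bolt.org.text")
print(f".bolt.org.text share of text: {org_share:.0%}")
```

In practice the input would come from running `size -A` (or an equivalent tool) on the binary before and after BOLT, so that the two totals can be compared directly.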