Consider using LTO + PGO + Bolt #140
Comments
I did some performance experiments on my local machine. My setup:
For benchmark purposes and profile generation, I've used KqpLoad actor (https://ydb.tech/en/docs/development/load-actors-kqp) which I've run multiple times for 300 seconds each time (all other parameters are default). YDB setup - local with RAM storage as described here: https://ydb.tech/en/docs/getting_started/self_hosted/ydb_local but with my own I did the following things:
The results are the following:
Also, I've tried to apply BOLT but Additional notes regarding PGO via instrumentation. During my profile generation with instrumented |
Well, I managed to run BOLT with some "magic" options (details are here: llvm/llvm-project#61711). As expected, BOLT didn't provide a significant performance boost after PGO - but still, I see measurable improvements:
I think Propeller (an alternative approach, similar to BOLT but from Google) could bring almost the same numbers. I tried to test YDB with Propeller, but it requires the latest Clang compiler from the main branch, and YDB has a bunch of compilation errors with it. Right now I lack the motivation to fix them. Maybe one day I will test it too :)
Hi Alexander Zaitsev, thank you very much for sharing this excellent idea and running the initial experiments. One of our engineers has confirmed your results and is working further on the integration details. We will be back soon, once we have collected more data and understand the best possible usage.
@eivanov89 do you have any updates regarding PGO? If you confirm the results and find them useful, I suggest adding a note to the YDB documentation about tuning YDB with PGO. Here are examples from other projects of what such documentation can look like:
Having this kind of information in the official documentation makes optimization opportunities more visible to end users and maintainers.
Hi @zamazan4ik, sorry for the delay. We have some issues with our internal tools and build. We hope to solve them soon, though. If that fails, we will consider applying this to the GitHub build only.
Understood. If you confirm the results above, I suggest adding a note about PGO to the YDB documentation, so that users who build YDB binaries on their own can estimate the performance benefits of PGO and optimize their builds too.
The tests that we both used to evaluate PGO are too narrow, imho. We're going to try YCSB and TPC-C to check whether real benchmarks benefit in the same manner as the microbenchmarks we have used so far.
Using LTO+PGO to optimize MySQL greatly improved its performance. Applying BOLT on top of that did not yield any further improvement. Is this result in line with expectations?
Hi!
YDB does not currently support building with more advanced optimization techniques like PGO and BOLT. These tools are seeing increasing adoption in the community as a way to further optimize programs. With them, there is a huge chance to gain even more performance "for free".
Here I suggest at least experimenting with an LTO + PGO + BOLT pipeline (or any combination of them) and testing whether it improves the project's performance. If it does, it would be awesome to have prebuilt binaries with these optimizations applied out of the box. It would also be helpful for users to be able to tune their own binaries to their own workloads using functionality integrated into the build scripts.
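To make the proposed pipeline more concrete, here is a rough sketch of the stages as plain Clang/LLVM command lines, assembled by a small Python helper. It is only an illustration under assumptions: the source file, binary name, profile paths, `-O2` level and workload arguments are hypothetical, it does not reflect how YDB's build system is actually invoked, and the BOLT options are a commonly used set rather than the "magic" options mentioned in this thread. The script only prints the commands; it does not run them.

```python
import shlex
from typing import Dict, List


def pgo_pipeline(sources: List[str], binary: str = "app",
                 profile_dir: str = "pgo-profiles") -> Dict[str, List[str]]:
    """Return the LTO + PGO + BOLT stages as argv lists, in order.

    All file names here are placeholders chosen for illustration.
    """
    merged = "merged.profdata"
    # ThinLTO with lld; -O2 is an arbitrary choice for this sketch.
    common = ["clang++", "-O2", "-flto=thin", "-fuse-ld=lld"]
    return {
        # 1. Build with instrumentation that writes raw profiles at exit.
        "instrumented_build": common + [f"-fprofile-generate={profile_dir}",
                                        *sources, "-o", f"{binary}.instr"],
        # 2. Run a representative workload (hypothetical arguments).
        "run_workload": [f"./{binary}.instr", "--your-workload-args"],
        # 3. Merge the raw profiles into a single indexed profile.
        "merge_profiles": ["llvm-profdata", "merge",
                           f"-output={merged}", profile_dir],
        # 4. Rebuild using the profile; keep relocations so BOLT can
        #    reorder functions later.
        "optimized_build": common + [f"-fprofile-use={merged}",
                                     "-Wl,--emit-relocs",
                                     *sources, "-o", binary],
        # 5. Sample the optimized binary with branch records (LBR).
        "perf_record": ["perf", "record", "-e", "cycles:u", "-j", "any,u",
                        "-o", "perf.data", "--",
                        f"./{binary}", "--your-workload-args"],
        # 6. Convert the perf samples into BOLT's profile format.
        "convert_profile": ["perf2bolt", "-p", "perf.data",
                            "-o", "perf.fdata", binary],
        # 7. Post-link optimization with BOLT.
        "bolt": ["llvm-bolt", binary, "-o", f"{binary}.bolt",
                 "-data=perf.fdata", "-reorder-blocks=ext-tsp",
                 "-reorder-functions=hfsort", "-split-functions"],
    }


if __name__ == "__main__":
    for name, argv in pgo_pipeline(["main.cpp"], binary="demo").items():
        print(f"# {name}")
        print(shlex.join(argv))
```

Each stage can be skipped independently: steps 1 to 4 alone give LTO + PGO, and steps 5 to 7 can be applied to any binary linked with `--emit-relocs`.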
Also, there are some caveats to consider like:
Links: