
Job number calculation should take into account the amount of swap space #72

Open
vgurevich opened this issue Feb 7, 2025 · 7 comments
Labels
enhancement New feature or request

Comments

@vgurevich
Contributor

The builds became a lot slower on my system after #38, because it now calculates the maximum number of jobs to be 3 on my system with 4 CPUs and 16 GB of memory.

This would've been the right thing to do if my system lacked swap space (indeed, the build would not even complete), but I did add swap space, and it would be nice to continue using all 4 CPUs by default (it definitely works when building the standard SDE).

The proposal is to read not only MemTotal from /proc/meminfo, but also SwapTotal, and to change the heuristic accordingly.

Also, it would probably be better to look for MemTotal explicitly instead of assuming that it is always the first line of /proc/meminfo.
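
A minimal sketch of what that could look like, as a small Python helper (the 5 GiB-per-job figure and the swap weighting are illustrative assumptions, not p4studio's actual heuristic):

```python
import os
import re

def read_meminfo_kib(fields=("MemTotal", "SwapTotal")):
    """Parse /proc/meminfo, matching the requested field names explicitly
    instead of relying on MemTotal being the first line. Values are in KiB."""
    values = {}
    pattern = re.compile(r"^(\w+):\s+(\d+)\s+kB")
    with open("/proc/meminfo") as f:
        for line in f:
            m = pattern.match(line)
            if m and m.group(1) in fields:
                values[m.group(1)] = int(m.group(2))
    return values

def default_jobs(gib_per_job=5.0, swap_weight=0.5):
    """Illustrative heuristic: count swap at a reduced weight, so adding swap
    can raise the job limit without being treated as equivalent to RAM."""
    info = read_meminfo_kib()
    mem_gib = info.get("MemTotal", 0) / (1024 * 1024)
    swap_gib = info.get("SwapTotal", 0) / (1024 * 1024)
    effective_gib = mem_gib + swap_weight * swap_gib
    by_memory = max(1, int(effective_gib // gib_per_job))
    return min(os.cpu_count() or 1, by_memory)
```

The intent is that configuring swap can bump the default up by a job or so, while a swapless machine keeps the more conservative cap; the exact per-job figure and swap weight would need testing against the OOM concerns raised below.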

@jafingerhut
Contributor

The code calculating this is pretty easy to find in the PR you linked. Tweak it all you like. If someone files an issue because the build fails, you get to fix it :-)

@jafingerhut
Contributor

Warning: as you have personally experienced earlier, if you tweak the formula to use too many parallel jobs, the build gets dramatically slower, or fails because of the OOM killer in the kernel. Not fun to debug.

@jafingerhut
Contributor

jafingerhut commented Feb 7, 2025

Sorry, one more answer that I just remembered now. With the current code, you can always specify --jobs <number> explicitly, and it will not do any calculation of parallel jobs for you. If you do that, it is on you to pick a good number, and you should not file an issue if the build fails because of using too many parallel jobs.

Personally, I strongly recommend that whatever automated default method is used to pick the number of parallel jobs for the build, it is far better for it to sometimes err by using one fewer parallel job than the maximum possible than to sometimes cause performance death by too much swapping, or failure due to the OOM killer in the kernel. For expert users who want one more parallel job than the default, the --jobs <number> parameter is right there, available for them to shoot themselves in the foot, or not, if they are careful enough.
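
As a sketch of that design principle (the function, the flag wiring, and the 4 GiB-per-job figure are hypothetical, not p4studio's actual code): an explicit --jobs bypasses the calculation entirely, and the automatic default rounds down so it errs toward one job too few.

```python
import argparse
import os

def conservative_default_jobs(mem_gib, gib_per_job=4.0):
    # Round down and never exceed the CPU count, so the automatic default
    # errs on the side of one job too few rather than one too many.
    return max(1, min(os.cpu_count() or 1, int(mem_gib // gib_per_job)))

parser = argparse.ArgumentParser()
parser.add_argument("--jobs", type=int, default=None,
                    help="explicit parallel job count; disables the automatic calculation")
args = parser.parse_args()

# An explicit --jobs value is taken as-is, for better or worse;
# the heuristic only runs when the flag is absent.
# 16 GiB is the figure from the original report; in practice it would
# come from /proc/meminfo.
jobs = args.jobs if args.jobs is not None else conservative_default_jobs(mem_gib=16)
print(f"building with -j{jobs}")
```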

@fruffy added the enhancement (New feature or request) label Feb 7, 2025
@vgurevich
Contributor Author

Yes, I know about --jobs <number>. The point is that if the command worked in a certain way (performance-wise or CPU-utilization-wise) before, it should not suddenly start working slower :) and force me to add this parameter every time I start it from now on.

I'll make the changes -- just wanted to make sure you are OK with them.

@jafingerhut
Contributor

I am OK with them if they work, and not OK with them if they trigger the OOM killer or massive swapping-related performance degradation for any combination of number of CPUs, amount of RAM, and other system configurations that anyone might try in the future, which is NOT only 4 vCPUs + 16 GB RAM + the amount of swap you personally configure on your system.

You use the word force in an unusual way. You are not forced to use that option (it's in the word "option").

@vgurevich
Contributor Author

Another option is to remove the unity builds, or to tune them so that they use a little less memory. At least on AWS the ratio of RAM to CPUs is 4 GB per vCPU, which is cutting it really close to what the tool assumes.

@jafingerhut
Contributor

Removing the unity builds would probably increase the build time by a noticeable amount, but probably not a large percentage of the overall install. It should significantly reduce the amount of memory required per parallel job.

Before my changes in #38, as far as I know, p4studio made no assumptions about the memory required -- it just used nproc as the number of parallel jobs if you did not use --jobs <number>, and gave you horrible performance or a failure if your memory was too low relative to the number of CPUs.
