Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SCons: Default num_jobs to max CPUs minus 1 if not specified #63087

Merged
merged 1 commit into from
Jul 17, 2022

Conversation

akien-mga
Copy link
Member

@akien-mga akien-mga commented Jul 16, 2022

This doesn't change the behavior when --jobs/-j is specified as a
command-line argument or in SCONSFLAGS.

The SCons hack used to know if num_jobs was set by the user is derived
from the MongoDB setup.

We use os.sched_getaffinity to respect potential limits set by the user
or system administrator.

We use os.cpu_count() for portability (available since Python 3.4).

With 4 CPUs or less, we use the max. With more than 4 we use max - 1 to
preserve some bandwidth for the user's other programs.

Edit: godot-cpp equivalent: godotengine/godot-cpp#788


Redo of #9972, 5 years later I've warmed up to the idea (CC @capnm).

It's up for discussion though. I think some modern buildsystems are now doing this automatically so users are used to it just using all the available capacity. Notably I saw this recent example of someone mistakenly building with one core. It's still possible to specify -j1 when you want to compile sequentially (useful to check a build error without the noise of other jobs).

This change would enable us to simplify the compiling documentation (e.g. here) a bit so we don't have to drag along system-specific instructions to get the number of CPU cores. And I noticed that GitHub CI runners for macOS actually have 3-core CPUs so removing our --jobs=2 should speed up CI significantly: https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources

Edit: CI build logs on this PR confirm 3 CPUs for macOS and 2 CPUs for Windows/Linux runners.

@akien-mga akien-mga added enhancement discussion topic:buildsystem cherrypick:3.x Considered for cherry-picking into a future 3.x release labels Jul 16, 2022
@akien-mga akien-mga added this to the 4.0 milestone Jul 16, 2022
@akien-mga akien-mga requested a review from a team as a code owner July 16, 2022 23:17
@akien-mga
Copy link
Member Author

We use os.sched_getaffinity to respect potential limits set by the user
or system administrator.

Oh well I had missed this note in the docs: https://docs.python.org/3/library/os.html#os.sched_getaffinity

These functions control how a process is allocated CPU time by the operating system. They are only available on some Unix platforms.

Not available on macOS and Windows... so back to os.cpu_count().

@akien-mga akien-mga force-pushed the scons-num_jobs-default-max branch 3 times, most recently from 66db73a to b23913d Compare July 16, 2022 23:39
@akien-mga akien-mga force-pushed the scons-num_jobs-default-max branch from b23913d to fb857c2 Compare July 16, 2022 23:57
@mhilbrunner
Copy link
Member

FWIW, I agree with this change and it is what I expect out of the box.

@akien-mga
Copy link
Member Author

If any Python gurus want to double check what's the absolute best portable way to query CPU count, feel free. I saw multiprocess.cpu_count() (which apparently calls os.cpu_count() so I used that one directly) and psutil.cpu_count() (used by MongoDB), but it's not in the stdlib so I'm not too keen on that one.

@LauraWebdev
Copy link

I think this could be a great proposal, though I'd limit the default to num_of_cores - 1 so it's not clogging the entire bandwidth.

@akien-mga
Copy link
Member Author

I think this could be a great proposal, though I'd limit the default to num_of_cores - 1 so it's not clogging the entire bandwidth.

Yeah that's what I tend to do myself, I have 8 cores on the laptop but use 7 so it doesn't interfere too much with my always on Firefox.

Got a couple more echoes along this line:
https://twitter.com/mwesterdahl76/status/1548571946786308096
https://twitter.com/simplex_fx/status/1548572362685452288

And one suggesting using half the cores but that's pretty arbitrary and unexpected IMO, I prefer "all or nothing" defaults:
https://twitter.com/MeezPower/status/1548583843954819073

I think for low CPU situations I might stick to max CPU, as that's where it makes the biggest difference (between e.g. 1 and 2 cores), and that matches the kind of config users have on server VMs like CI runners. So maybe up to 4 cores we use max, above we use max - 1.

@akien-mga akien-mga force-pushed the scons-num_jobs-default-max branch from fb857c2 to eed306f Compare July 17, 2022 10:02
@akien-mga akien-mga changed the title SCons: Default num_jobs to max available CPUs if not specified SCons: Default num_jobs to max CPUs minus 1 if not specified Jul 17, 2022
This doesn't change the behavior when `--jobs`/`-j` is specified as a
command-line argument or in `SCONSFLAGS`.

The SCons hack used to know if `num_jobs` was set by the user is derived
from the MongoDB setup.

We use `os.cpu_count()` for portability (available since Python 3.4).

With 4 CPUs or less, we use the max. With more than 4 we use max - 1 to
preserve some bandwidth for the user's other programs.
@akien-mga akien-mga force-pushed the scons-num_jobs-default-max branch from eed306f to ea21122 Compare July 17, 2022 10:38
@BenjaminNavarro
Copy link
Contributor

N-1 is a good default even though I tend to go to N or N+1 sometimes. To me the biggest problem is not really the CPU usage but the RAM one. Each job you add can take a decent amount or RAM so if you have a high core system with low memory available you will end up swapping, which is much worse than slowness induced by the increased CPU usage. I don't know what's the RAM usage like while compiling Godot though, but that might be something to look at.

@bruvzg
Copy link
Member

bruvzg commented Jul 17, 2022

To me the biggest problem is not really the CPU usage but the RAM one.

It's gonna be hard to estimate RAM usage, it depends on specific tool chain used and build settings a lot, especially for LTO enabled builds (GCC LTO sometimes use 10 times more RAM than clang LTO).

@akien-mga akien-mga merged commit 3ae6159 into godotengine:master Jul 17, 2022
@akien-mga akien-mga deleted the scons-num_jobs-default-max branch July 17, 2022 13:43
@akien-mga
Copy link
Member Author

Cherry-picked for 3.5.

@akien-mga akien-mga removed the cherrypick:3.x Considered for cherry-picking into a future 3.x release label Jul 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants