available_parallelism: Gracefully handle zero value cfs_period_us #104493
There seem to be some scenarios where `cpu.cfs_period_us` can contain `0`. This causes a panic when calling `std::thread::available_parallelism()`, as binaries built by `cargo test` do, which is how the issue was discovered. I don't feel like `0` is a good value for `cpu.cfs_period_us`, but I also don't think applications should panic if this value is seen.

This case is handled by other projects which read this information:

- num_cpus: https://github.com/seanmonstar/num_cpus/blob/e437b9d9083d717692e35d917de8674a7987dd06/src/linux.rs#L207-L210
- ninja: https://github.com/ninja-build/ninja/pull/2174/files
- dotnet: https://github.com/dotnet/runtime/blob/c4341d45acca3ea662cd8d71e7d71094450dd045/src/coreclr/pal/src/misc/cgroup.cpp#L481-L483

Before this change, this panic could be seen in environments set up as described above:

```
$ RUST_BACKTRACE=1 cargo test
    Finished test [unoptimized + debuginfo] target(s) in 3.55s
     Running unittests src/main.rs (target/debug/deps/x-9a42e145aca2934d)
thread 'main' panicked at 'attempt to divide by zero', library/std/src/sys/unix/thread.rs:546:70
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::panicking::panic
   3: std::sys::unix::thread::cgroups::quota
   4: std::sys::unix::thread::available_parallelism
   5: std::thread::available_parallelism
   6: test::helpers::concurrency::get_concurrency
   7: test::console::run_tests_console
   8: test::test_main
   9: test::test_main_static
  10: x::main
             at ./src/main.rs:1:1
  11: core::ops::function::FnOnce::call_once
             at /tmp/rust-1.64-1.64.0-1/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
error: test failed, to rerun pass '--bin local-rabmq-amqpprox'
```

I've tested this change in an environment which has the bad setup, and rebuilding the test executable against a fixed std library fixes the panic.
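To illustrate the kind of guard being discussed, here is a minimal sketch of turning a quota/period pair into a CPU count without dividing by zero. It is not the code from this PR or from std: the function name `cpu_limit_from_cfs`, its signature, and the choice to treat a non-positive quota as "no limit" are assumptions made for the example.

```rust
/// Hypothetical helper (not std's implementation): derive a CPU limit from
/// the values read out of `cpu.cfs_quota_us` and `cpu.cfs_period_us`.
/// Returns `None` when no usable limit can be computed.
fn cpu_limit_from_cfs(quota_us: i64, period_us: i64) -> Option<usize> {
    // Assumption: a non-positive quota (such as -1) means "no limit".
    // A non-positive period is rejected so the division below can never
    // hit zero, which is the case that triggered the panic.
    if quota_us <= 0 || period_us <= 0 {
        return None;
    }
    // Integer division rounds down; never report fewer than one CPU.
    Some(((quota_us / period_us) as usize).max(1))
}
```

With this shape, a caller that gets `None` can simply fall back to whatever CPU count it would have used without a cgroup limit, so a zero period degrades to "no limit" instead of aborting the process.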
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @m-ou-se (or someone else) soon. Please see the contribution instructions for more information.
Hey! It looks like you've submitted a new PR for the library teams! If this PR contains changes to any Examples of
https://docs.kernel.org/scheduler/sched-bwc.html#management
So the value 0 shouldn't occur. Are you on some kind of cgroup emulation? Perhaps you should report to whatever is causing this in the first place that the value can be 0 but shouldn't be.
It does occur in practice, so it is worth guarding against this possibility. It makes
That doesn't answer the question of what's causing it, whether it's a kernel bug, some third-party software, or something else. It's good to have a documented root cause.
@bors r+
…iaskrgr

Rollup of 8 pull requests

Successful merges:

- rust-lang#104402 (Move `ReentrantMutex` to `std::sync`)
- rust-lang#104493 (available_parallelism: Gracefully handle zero value cfs_period_us)
- rust-lang#105359 (Make sentinel value configurable in `library/std/src/sys_common/thread_local_key.rs`)
- rust-lang#105497 (Clarify `catch_unwind` docs about panic hooks)
- rust-lang#105570 (Properly calculate best failure in macro matching)
- rust-lang#105702 (Format only modified files)
- rust-lang#105998 (adjust message on non-unwinding panic)
- rust-lang#106161 (Iterator::find: link to Iterator::position in docs for discoverability)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
There seem to be some scenarios where the cgroup CPU quota field `cpu.cfs_period_us` can contain `0`. This field is used to determine the "amount" of parallelism suggested by the function `std::thread::available_parallelism`.

A zero value in this field causes a panic when `available_parallelism()` is invoked. This issue was detected through the call made by binaries built by `cargo test`. I really don't feel like `0` is a good value for `cpu.cfs_period_us`, but I also don't think applications should panic if this value is seen.

This panic started happening with Rust 1.64.0.

This case is gracefully handled by other projects which read this information: num_cpus, ninja, dotnet.

Before this change, running `cargo test` in environments configured as described above would trigger this panic.

I've tested this change in an environment which has the bad (questionable?) setup, and rebuilding the test executable against a fixed std library fixes the panic.
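For context on how callers typically consume this function, here is a small sketch of picking a worker count from `std::thread::available_parallelism()`. The helper name `worker_count` and the fallback of `1` are assumptions for illustration. Note that this fallback only covers the `Err` case: the divide-by-zero described above panicked inside the call itself, so no caller-side handling of the returned `Result` could have avoided it.

```rust
use std::num::NonZeroUsize;
use std::thread;

/// Hypothetical helper: choose how many worker threads to spawn.
fn worker_count() -> usize {
    thread::available_parallelism()
        // The function returns io::Result<NonZeroUsize>; unwrap the count.
        .map(NonZeroUsize::get)
        // Assumption: fall back to a single worker if the query fails.
        .unwrap_or(1)
}
```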