diff --git a/text/0000-bench-utils.md b/text/0000-bench-utils.md
new file mode 100644
index 00000000000..69f64a7b4d2
--- /dev/null
+++ b/text/0000-bench-utils.md
@@ -0,0 +1,138 @@
+- Feature Name: black_box-and-clobber
+- Start Date: 2018-03-12
+- RFC PR: (leave this empty)
+- Rust Issue: (leave this empty)
+
+# Summary
+[summary]: #summary
+
+This RFC adds two functions to `core::mem`: `black_box` and `clobber`, which are
+mainly useful for writing benchmarks.
+
+# Motivation
+[motivation]: #motivation
+
+The `black_box` and `clobber` functions are useful for writing synthetic
+benchmarks where, due to the constrained nature of the benchmark, the compiler
+is able to perform optimizations that wouldn't otherwise trigger in practice.
+
+The implementation of these functions is backend-specific and requires inline
+assembly. If the standard library does not provide them, users are forced to
+rely on brittle workarounds on nightly.
+
+# Guide-level explanation
+[guide-level-explanation]: #guide-level-explanation
+
+
+## `mem::black_box`
+
+The function:
+
+```rust
+pub fn black_box<T>(x: T) -> T;
+```
+
+prevents the value `x` from being optimized away and flushes pending reads/writes
+to memory. It does not prevent optimizations on the expression generating the
+value `x` nor on the return value of the function. For
+example ([`rust.godbolt.org`](https://godbolt.org/g/YP2GCJ)):
+
+```rust
+fn foo(x: i32) -> i32 {
+    mem::black_box(2 + x);
+    3
+}
+let a = foo(2);
+```
+
+Here, the compiler can simplify the expression `2 + x` into `2 + 2` and then
+`4`, but it is not allowed to discard `4`. Instead, it must store `4` into a
+register even though it is not used by anything afterwards.
+
+## `mem::clobber`
+
+The function:
+
+```rust
+pub fn clobber() -> ();
+```
+
+flushes all pending writes to memory. Memory managed by block-scoped objects must
+first be "escaped" with `black_box`.
+
+Using `mem::{black_box, clobber}` we can benchmark `Vec::push` as follows:
+
+```rust
+fn bench_vec_push_back(bench: Bencher) -> BenchResult {
+    let n = /* large enough number */;
+    let mut v = Vec::with_capacity(n);
+    bench.iter(|| {
+        // Escape the vector pointer:
+        mem::black_box(v.as_ptr());
+        v.push(42_u8);
+        // Flush the write of 42 to memory:
+        mem::clobber();
+    })
+}
+```
+# Reference-level explanation
+[reference-level-explanation]: #reference-level-explanation
+
+* `mem::black_box(x)`: flushes all pending reads/writes to memory and prevents
+  `x` from being optimized away while still allowing optimizations on the
+  expression that generates `x`.
+* `mem::clobber`: flushes all pending writes to memory.
+
+# Drawbacks
+[drawbacks]: #drawbacks
+
+TBD.
+
+# Rationale and alternatives
+[alternatives]: #alternatives
+
+An alternative design was proposed during the discussion on
+[rust-lang/rfcs/issues/1484](https://github.com/rust-lang/rfcs/issues/1484), in
+which the following two functions are provided instead:
+
+```rust
+#[inline(always)]
+pub fn value_fence<T>(x: T) -> T {
+    let y = unsafe { (&x as *const T).read_volatile() };
+    std::mem::forget(x);
+    y
+}
+
+#[inline(always)]
+pub fn evaluate_and_drop<T>(x: T) {
+    unsafe {
+        let mut y = std::mem::uninitialized();
+        std::ptr::write_volatile(&mut y as *mut T, x);
+        drop(y); // not necessary but for clarity
+    }
+}
+```
+
+This approach is not pursued in this RFC because these two functions:
+
+* add overhead ([`rust.godbolt.org`](https://godbolt.org/g/aCpPfg)): `volatile`
+  reads and stores are not no-ops, but the proposed `black_box` and `clobber`
+  functions are.
+* are implementable on stable Rust: while we could add them to `std` they do not
+  necessarily need to be there.
+
+# Prior art
+[prior-art]: #prior-art
+
+These two exact functions are provided in the [`Google
+Benchmark`](https://github.com/google/benchmark) C++ library, where they are called
+[`DoNotOptimize`](https://github.com/google/benchmark/blob/61497236ddc0d797a47ef612831fb6ab34dc5c9d/include/benchmark/benchmark.h#L306)
+(`black_box`) and
+[`ClobberMemory`](https://github.com/google/benchmark/blob/61497236ddc0d797a47ef612831fb6ab34dc5c9d/include/benchmark/benchmark.h#L317)
+(`clobber`). A `black_box` function with slightly different semantics is provided by the `test` crate:
+[`test::black_box`](https://github.com/rust-lang/rust/blob/master/src/libtest/lib.rs#L1551).
+
+# Unresolved questions
+[unresolved]: #unresolved-questions
+
+TBD.
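
A rough sketch of the guide-level `Vec::push` example, written so that it compiles on stable Rust. It rests on several assumptions that are not part of the RFC text: `std::hint::black_box` stands in for the proposed `mem::black_box`, `clobber` is a hypothetical approximation built from an empty `asm!` block (in the spirit of Google Benchmark's `ClobberMemory`) with a `compiler_fence` fallback on other targets, and `push_n` is an illustrative name that replaces the `Bencher` loop. It is not the implementation the RFC proposes.

```rust
use std::hint::black_box; // assumption: stable stand-in for the proposed `mem::black_box`

/// Hypothetical approximation of the proposed `mem::clobber`.
/// An `asm!` block without the `nomem` option must be assumed by the
/// compiler to read and write any memory it can reach.
#[cfg(any(target_arch = "x86_64", target_arch = "aarch64"))]
#[inline(always)]
pub fn clobber() {
    unsafe {
        std::arch::asm!("", options(nostack, preserves_flags));
    }
}

/// Fallback for other targets: a compiler-only fence. This is only an
/// approximation and may be weaker than the asm-based version above.
#[cfg(not(any(target_arch = "x86_64", target_arch = "aarch64")))]
#[inline(always)]
pub fn clobber() {
    std::sync::atomic::compiler_fence(std::sync::atomic::Ordering::SeqCst);
}

/// Pushes `n` bytes into a pre-allocated vector, escaping the buffer
/// pointer before each push and clobbering memory after it.
pub fn push_n(n: usize) -> Vec<u8> {
    let mut v: Vec<u8> = Vec::with_capacity(n);
    for _ in 0..n {
        // Escape the vector pointer so the buffer counts as reachable memory.
        black_box(v.as_ptr());
        v.push(42_u8);
        // Treat all reachable memory as read and written at this point.
        clobber();
    }
    v
}
```

A benchmark harness would time the loop body of `push_n`. The returned vector is only there so the result can be inspected.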