-
Notifications
You must be signed in to change notification settings - Fork 1.6k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
RFC: mem::black_box and mem::clobber
- Loading branch information
Showing
1 changed file
with
138 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,138 @@ | ||
- Feature Name: black_box-and-clobber | ||
- Start Date: 2018-03-12 | ||
- RFC PR: (leave this empty) | ||
- Rust Issue: (leave this empty) | ||
|
||
# Summary | ||
[summary]: #summary | ||
|
||
This RFC adds two functions to `core::mem`: `black_box` and `clobber`, which are | ||
mainly useful for writing benchmarks. | ||
|
||
# Motivation | ||
[motivation]: #motivation | ||
|
||
The `black_box` and `clobber` functions are useful for writing synthetic | ||
benchmarks where, due to the constrained nature of the benchmark, the compiler | ||
is able to perform optimizations that wouldn't otherwise trigger in practice. | ||
|
||
The implementation of these functions is backend-specific and requires inline | ||
assembly. Such that if the standard library does not provide them, the users are | ||
required to use brittle workarounds on nightly. | ||
|
||
# Guide-level explanation | ||
[guide-level-explanation]: #guide-level-explanation | ||
|
||
|
||
## `mem::black_box` | ||
|
||
The function: | ||
|
||
```rust | ||
pub fn black_box<T>(x: T) -> T; | ||
``` | ||
|
||
prevents the value `x` from being optimized away and flushes pending reads/writes | ||
to memory. It does not prevent optimizations on the expression generating the | ||
value `x` nor on the return value of the function. For | ||
example ([`rust.godbolt.org`](https://godbolt.org/g/YP2GCJ)): | ||
|
||
```rust | ||
fn foo(x: i32) -> i32{ | ||
mem::black_box(2 + x); | ||
3 | ||
} | ||
let a = foo(2); | ||
``` | ||
|
||
Here, the compiler can simplify the expression `2 + x` into `2 + 2` and then | ||
`4`, but it is not allowed to discard `4`. Instead, it must store `4` into a | ||
register even though it is not used by anything afterwards. | ||
|
||
## `mem::clobber` | ||
|
||
The function | ||
|
||
```rust | ||
pub fn clobber() -> (); | ||
``` | ||
|
||
flushes all pending writes to memory. Memory managed by block scope objects must | ||
be "escaped" with `black_box` . | ||
|
||
Using `mem::{black_box, clobber}` we can benchmark `Vec::push` as follows: | ||
|
||
```rust | ||
fn bench_vec_push_back(bench: Bencher) -> BenchResult { | ||
let n = /* large enough number */; | ||
let mut v = Vec::with_capacity(n); | ||
bench.iter(|| { | ||
// Escape the vector pointer: | ||
mem::black_box(v.as_ptr()); | ||
v.push_back(42_u8); | ||
// Flush 42 write to memory: | ||
mem::clobber(); | ||
}) | ||
} | ||
``` | ||
# Reference-level explanation | ||
[reference-level-explanation]: #reference-level-explanation | ||
|
||
* `mem::black_box(x)`: flushes all pending writes/read to memory and prevents | ||
`x` from being optimized away while still allowing optimizations on the | ||
expression that generates `x`. | ||
* `mem::clobber`: flushes all pending writes to memory. | ||
|
||
# Drawbacks | ||
[drawbacks]: #drawbacks | ||
|
||
TBD. | ||
|
||
# Rationale and alternatives | ||
[alternatives]: #alternatives | ||
|
||
An alternative design was proposed during the discussion on | ||
[rust-lang/rfcs/issues/1484](https://github.com/rust-lang/rfcs/issues/1484), in | ||
which the following two functions are provided instead: | ||
|
||
```rust | ||
#[inline(always)] | ||
pub fn value_fence<T>(x: T) -> T { | ||
let y = unsafe { (&x as *const T).read_volatile() }; | ||
std::mem::forget(x); | ||
y | ||
} | ||
|
||
#[inline(always)] | ||
pub fn evaluate_and_drop<T>(x: T) { | ||
unsafe { | ||
let mut y = std::mem::uninitialized(); | ||
std::ptr::write_volatile(&mut y as *mut T, x); | ||
drop(y); // not necessary but for clarity | ||
} | ||
} | ||
``` | ||
|
||
This approach is not pursued in this RFC because these two functions: | ||
|
||
* add overhead ([`rust.godbolt.com`](https://godbolt.org/g/aCpPfg)): `volatile` | ||
reads and stores aren't no ops, but the proposed `black_box` and `clobber` | ||
functions are. | ||
* are implementable on stable Rust: while we could add them to `std` they do not | ||
necessarily need to be there. | ||
|
||
# Prior art | ||
[prior-art]: #prior-art | ||
|
||
These two exact functions are provided in the [`Google | ||
Benchmark`](https://github.com/google/benchmark) C++ library: are called | ||
[`DoNotOptimize`](https://github.com/google/benchmark/blob/61497236ddc0d797a47ef612831fb6ab34dc5c9d/include/benchmark/benchmark.h#L306) | ||
(`black_box`) and | ||
[`ClobberMemory`](https://github.com/google/benchmark/blob/61497236ddc0d797a47ef612831fb6ab34dc5c9d/include/benchmark/benchmark.h#L317). | ||
The `black_box` function with slightly different semantics is provided by the `test` crate: | ||
[`test::black_box`](https://github.com/rust-lang/rust/blob/master/src/libtest/lib.rs#L1551). | ||
|
||
# Unresolved questions | ||
[unresolved]: #unresolved-questions | ||
|
||
TBD. |