-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite docs for std::ptr
#49767
Rewrite docs for std::ptr
#49767
Conversation
- Add links to the GNU libc docs for `memmove`, `memcpy`, and `memset`, as well as internally linking to other functions in `std::ptr` - List sources of UB for all functions. - Add example to `ptr::drop_in_place` and compares it to `ptr::read`. - Add examples which more closely mirror real world uses for the functions in `std::ptr`. Also, move the reimplementation of `mem::swap` to the examples of `ptr::read` and use a more interesting example for `copy_nonoverlapping`. - Change module level description
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @steveklabnik (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
Your PR failed on Travis. Through arcane magic we have determined that the following fragments from the build log may contain information about the problem. Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The CI build failed because I added an ignored doctest to |
This also fixes improper text wrapping.
This comment has been minimized.
This comment has been minimized.
The rendered version does not make clear that this is a link to another page, and it breaks the anchor link.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Sorry for all the failing builds, it took me a while to figure out that I had to do |
Ill try to give this a review soon, thanks! We should also probably have @rust-lang/libs look at it too |
src/libcore/intrinsics.rs
Outdated
/// The caller must ensure that `src` points to a valid sequence of type | ||
/// `T`. | ||
/// | ||
/// # Undefined Behavior |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see the role of having both Safety
and Undefined Behavior
sections. It seems to me that the convention is to have a single section named Safety
.
[ lampam @ 21:52:55 ] (master •58 -7 +7) ~/asd/clone/rust
$ rg '#* Undefined [bB]ehavio(u)?r'
src/libcore/mem.rs
532:/// # Undefined behavior
src/liballoc/raw_vec.rs
154: /// # Undefined Behavior
171: /// # Undefined Behavior
$ rg '# Safety'
(far more results)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(granted, I often wish the # Safety
sections tended to more closely resemble your # Undefined Behavior
sections, but that is just my opinion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steveklabnik mentioned that new docs should make use of Undefined Behavior
(I think upper case is correct). All these functions are trivially unsafe because they dereference raw pointers, and the Safety
section as I have it doesn't add any new information. Perhaps it could be moved to the module docs and not repeated everywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I would favor a change like you just mentioned, but will also wait to see what Steve says. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Originally, I felt like these two cases were pretty distinct, but now, re-thinking about it, I'm not so sure, honestly. In some senses, UB is a subset of safety, and it feels like it's worth calling out separately, but maybe not. dunno.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused; I thought UB was the consequence of not following Safety. Is there a kind of unsafety that doesn't lead to UB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some background on my thought process:
While writing this, I took the point of view that Safety
referred to the language feature (i.e. the subset of rust you must opt into with the unsafe
keyword). Some functions (e.g. Vec::set_len
) are not obviously unsafe from looking at their signature. The Safety
section is probably a good place to discuss this.
In discussing why a function is unsafe
, one must communicate what invariants must be maintained to avoid UB. Most of the standard library uses the Safety
heading to communicate this. I am now of the opinion that there's never really a reason to have both of these top-level headings (as I do now) since the information is so closely related.
However, for functions whose unsafety is obvious because they take a raw pointer and have names like read
, write
, and copy
, it seemed clearer to name the section discussing its invariants Undefined Behavior
. RawVec::from_raw_parts
also uses this convention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@scottmcm "safety" is "memory safety", there are kinds of UB that are not memory safety related.
src/libcore/ptr.rs
Outdated
/// | ||
/// # Undefined Behavior | ||
/// | ||
/// `write` can trigger undefined behavior if any of the following conditions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am bothered by the language "can trigger undefined behavior," since for all intents and purposes, there's no difference from saying that it is undefined behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops. I thought I changed this for all functions to "Behavior is undefined if any of the following conditions are violated:", but managed to miss write{,_unaligned,_bytes}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Other than one tiny nit, the overall formatting and structure, etc, is awesome here! Once someone from libs signs off, and this gets tweaked, i'm happy to
src/libcore/ptr.rs
Outdated
@@ -10,7 +10,7 @@ | |||
|
|||
// FIXME: talk about offset, copy_memory, copy_nonoverlapping_memory | |||
|
|||
//! Raw, unsafe pointers, `*const T`, and `*mut T`. | |||
//! Manually manage memory through raw, unsafe pointers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would drop the "unsafe" bit here and just use "through raw pointers"
- Remove redundant "unsafe" from module description. - Add a missing `Safety` heading to `read_unaligned`. - Remove weasel words in `Undefined Behavior` description for `write{,_unaligned,_bytes}`.
Ping from triage! The reviewer asked a signoff from @rust-lang/libs, can someone check this PR? |
Ping from triage @rust-lang/libs! Can someone check if the docs here are correct? |
I'd personally prefer to fold the safety/undefined behavior sections together and also ensure it's worded something along the lines of "if violated these conditions will cause the function to be unsafe ..." instead of "this function is only unsafe if at least one of these conditions is violated ..." in the sense that these are descriptions of safety but not necessarily 100% exhaustive |
Non-`Copy` types should not be in volatile memory.
Ping from triage @steveklabnik! The author addressed the comments from the libs team. |
@bors: r+ rollup thanks! |
📌 Commit 827251e has been approved by |
…bnik Rewrite docs for `std::ptr` This PR attempts to resolve rust-lang#29371. This is a fairly major rewrite of the `std::ptr` docs, and deserves a fair bit of scrutiny. It adds links to the GNU libc docs for various instrinsics, adds internal links to types and functions referenced in the docs, adds new, more complex examples for many functions, and introduces a common template for discussing unsafety of functions in `std::ptr`. All functions in `std::ptr` (with the exception of `ptr::eq`) are unsafe because they either read from or write to a raw pointer. The "Safety" section now informs that the function is unsafe because it dereferences a raw pointer and requires that any pointer to be read by the function points to "a valid vaue of type `T`". Additionally, each function imposes some subset of the following conditions on its arguments. * The pointer points to valid memory. * The pointer points to initialized memory. * The pointer is properly aligned. These requirements are discussed in the "Undefined Behavior" section along with the consequences of using functions that perform bitwise copies without requiring `T: Copy`. I don't love my new descriptions of the consequences of making such copies. Perhaps the old ones were good enough? Some issues which still need to be addressed before this is merged: - [ ] The new docs assert that `drop_in_place` is equivalent to calling `read` and discarding the value. Is this correct? - [ ] Do `write_bytes` and `swap_nonoverlapping` require properly aligned pointers? - [ ] The new example for `drop_in_place` is a lackluster. - [ ] Should these docs rigorously define what `valid` memory is? Or should is that the job of the reference? Should we link to the reference? - [ ] Is it correct to require that pointers that will be read from refer to "valid values of type `T`"? - [x] I can't imagine ever using `{read,write}_volatile` with non-`Copy` types. Should I just link to {read,write} and say that the same issues with non-`Copy` types apply? - [x] `write_volatile` doesn't link back to `read_volatile`. - [ ] Update docs for the unstable [`swap_nonoverlapping`](rust-lang#42818) - [ ] Update docs for the unstable [unsafe pointer methods RFC](rust-lang/rfcs#1966) Looking forward to your feedback. r? @steveklabnik
…bnik Rewrite docs for `std::ptr` This PR attempts to resolve rust-lang#29371. This is a fairly major rewrite of the `std::ptr` docs, and deserves a fair bit of scrutiny. It adds links to the GNU libc docs for various instrinsics, adds internal links to types and functions referenced in the docs, adds new, more complex examples for many functions, and introduces a common template for discussing unsafety of functions in `std::ptr`. All functions in `std::ptr` (with the exception of `ptr::eq`) are unsafe because they either read from or write to a raw pointer. The "Safety" section now informs that the function is unsafe because it dereferences a raw pointer and requires that any pointer to be read by the function points to "a valid vaue of type `T`". Additionally, each function imposes some subset of the following conditions on its arguments. * The pointer points to valid memory. * The pointer points to initialized memory. * The pointer is properly aligned. These requirements are discussed in the "Undefined Behavior" section along with the consequences of using functions that perform bitwise copies without requiring `T: Copy`. I don't love my new descriptions of the consequences of making such copies. Perhaps the old ones were good enough? Some issues which still need to be addressed before this is merged: - [ ] The new docs assert that `drop_in_place` is equivalent to calling `read` and discarding the value. Is this correct? - [ ] Do `write_bytes` and `swap_nonoverlapping` require properly aligned pointers? - [ ] The new example for `drop_in_place` is a lackluster. - [ ] Should these docs rigorously define what `valid` memory is? Or should is that the job of the reference? Should we link to the reference? - [ ] Is it correct to require that pointers that will be read from refer to "valid values of type `T`"? - [x] I can't imagine ever using `{read,write}_volatile` with non-`Copy` types. Should I just link to {read,write} and say that the same issues with non-`Copy` types apply? - [x] `write_volatile` doesn't link back to `read_volatile`. - [ ] Update docs for the unstable [`swap_nonoverlapping`](rust-lang#42818) - [ ] Update docs for the unstable [unsafe pointer methods RFC](rust-lang/rfcs#1966) Looking forward to your feedback. r? @steveklabnik
Rollup of 11 pull requests Successful merges: - #49767 (Rewrite docs for `std::ptr`) - #50399 (save-analysis: handle aliasing imports a bit more nicely) - #50594 (Update the man page with additional --print options) - #50613 (Migrate the toolstate update bot to rust-highfive) - #50632 (Add minification process) - #50685 (ci: Add Dockerfile for dist-sparc64-linux) - #50691 (rustdoc: Add support for pub(restricted)) - #50712 (Improve eager type resolution error message) - #50720 (Add “Examples” section header in f32/f64 doc comments.) - #50733 (Hyperlink DOI against preferred resolver) - #50745 (Uncapitalize "You") Failed merges:
My concerns weren't addressed, and I don't think this documentation should have landed as-is. |
Ugh, why did GitHub mark the comments as resolved? |
Yeah, this isn't ready to go yet. I'm guessing @steveklabnik saw @alexcrichton's comment and not @gankro's concerns. I'm thinking of ways to address the concerns; it will likely involve refactoring common invariants to the module-level docs. The module docs will then link to the nomicon for a more in-depth discussion on uninitialized memory and pointer aliasing. Stuff which is specific to each function will remain in a list in the function level docs. Unfortunately, I am currently devoting all of my free time to studying for Code Jam Round 2. I will, in all likelihood, be eliminated this weekend and resume work on this. In the meantime, let me know if you need anything. |
@steveklabnik could you revert this PR, since the author is also fine with not landing it? |
There was [some confusion](rust-lang#49767 (comment)) and I accidentally merged a PR that wasn't ready.
…isdreavus Revert rust-lang#49767 There was [some confusion](rust-lang#49767 (comment)) and I accidentally merged a PR that wasn't ready.
Rollup of 10 pull requests Successful merges: - #50387 (Remove leftover tab in libtest outputs) - #50553 (Add Option::xor method) - #50610 (Improve format string errors) - #50649 (Tweak `nearest_common_ancestor()`.) - #50790 (Fix grammar documentation wrt Unicode identifiers) - #50791 (Fix null exclusions in grammar docs) - #50806 (Add `bless` x.py subcommand for easy ui test replacement) - #50818 (Speed up `opt_normalize_projection_type`) - #50837 (Revert #49767) - #50839 (Make sure people know the book is free oline) Failed merges:
@gankro, @steveklabnik I've pushed a new commit which adds a module-level definition of what a "valid" pointer is. The list of invariants for each function now links back to this definition. I put the newly built docs in a gh-pages branch on my fork so that they can be more easily reviewed. I'll keep this up to date as I push changes. I also removed the erroneous requirement that memory must be initialized if it is to be read from. Instead there is a module-level discussion of how one cannot make any assumptions about the value of uninitialized memory. The example asserts that, even after calling Also, do I need to open a new PR since this accidentally got merged and reverted? Thanks! |
What one can and can't do with uninitialized memory is a bit up in the air, but it might very well be much more restrictive than what the current wording implies, so I'd prefer to err on the side of caution in these docs. I don't have specific wording to suggest but here are some things that one might get from the current text that might (likely IMO) not be true when we finally get Unsafe Code Guidelines on this matter:
Note that most of these aren't even true of LLVM's |
Yes please! |
@rkruppe I was under the impression that this already was rigorously defined behavior and that I was merely too dumb to find it :) Since defining these semantics is out of scope for this PR, I think it's better to just not mention uninitialized memory at all. I'm assuming whatever choice gets made will permit loads and stores, which is all that is required by the functions in
Edit: here it is |
New PR opened. |
This PR attempts to resolve #29371.
Rendered
This is a fairly major rewrite of the
std::ptr
docs, and deserves a fair bit of scrutiny. It adds links to the GNU libc docs for various instrinsics, adds internal links to types and functions referenced in the docs, adds new, more complex examples for many functions, and introduces a common template for discussing unsafety of functions instd::ptr
.All functions in
std::ptr
(with the exception ofptr::eq
) are unsafe because they either read from or write to a raw pointer. The "Safety" section now informs that the function is unsafe because it dereferences a raw pointer and requires that any pointer to be read by the function points to "a valid vaue of typeT
".Additionally, each function imposes some subset of the following conditions on its arguments.
These requirements are discussed in the "Undefined Behavior" section along with the consequences of using functions that perform bitwise copies without requiring
T: Copy
. I don't love my new descriptions of the consequences of making such copies. Perhaps the old ones were good enough?Some issues which still need to be addressed before this is merged:
drop_in_place
is equivalent to callingread
and discarding the value. Is this correct?write_bytes
andswap_nonoverlapping
require properly aligned pointers?drop_in_place
is a lackluster.valid
memory is? Or should is that the job of the reference? Should we link to the reference?T
"?{read,write}_volatile
with non-Copy
types. Should I just link to {read,write} and say that the same issues with non-Copy
types apply?write_volatile
doesn't link back toread_volatile
.swap_nonoverlapping
Looking forward to your feedback.
r? @steveklabnik