-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CStr::from_bytes_with_nul #30614
CStr::from_bytes_with_nul #30614
Conversation
(rust_highfive has picked a reviewer for you, use r? to override) |
641ac9e
to
c5d9806
Compare
cc @rust-lang/libs |
@nagisa aha! I do feel like an explicit error type is probably better. A EDIT: On second thought, I can't imagine a case when you wouldn't know beforehand whether your byte slice is terminated or not, and we already have |
c9c189e
to
9c46d43
Compare
cc #20475, another issue that I believe this would handle. I'm gonna tag this with T-libs so it comes up during normal triage. I'm not so sure about this API personally, could you elaborate on some of the use cases you have in mind? |
Well, I'd break it into...
|
This is actually somewhat intentional, because the ideal representation of I do agree that literal construction is pretty difficult, but that's more along the lines of needs-to-be-a-compiler-plugin than anything else I think to ensure that the nul byte is always at the end. In theory the |
Whether the
My view would be that as long as you can reconstruct the
Well, if we assume a rustc with a more useful const fn story, we could implement it as a macro like so: macro_rules! cstr {
($s:expr) => {
unsafe { ::std::ffi::CStr::from_bytes_unchecked(concat!($s, "\0").as_bytes()) }
}
} The same could be implemented via Another thought is that we could make It's true that Also, another case where this API is useful is when creating C strings on the stack in a |
I think one part I'm worried about is that both of these assertions may fail assert_eq!(CStr::from_bytes(slice).unwrap().to_bytes(), slice);
assert_eq!(CStr::from_bytes_unchecked(slice).to_bytes(), slice); The first line will truncate Out of curiosity, what's the use case of creating a |
Ah, do you mean the naming is inconsistent vs
It's an unsafe method that runs the risk of memory unsafety if you provide it data that doesn't uphold the invariants, that's pretty much the point.
Mm, FFI is the main place to be using |
Oh I'm not worried about naming inconsistency, it just seems that if you construct a |
Well, if we break it apart...
In the current implementation, this will always succeed, if slice has a trailing null byte and no interior bytes. If neither of those are true, the
Again, this is unsafe so you have to assert you're maintaining the invariants, or you will indeed get a different slice back, read past allocated memory, etc. |
Ah yes that's correct, I think I misread the implementation of Ok, thinking about this some more, I think my confused stemmed from perhaps misconstruing this PR as sort of reinterpreting a Along those lines, however, I wonder if perhaps a better name would be |
Basically this, and it still also applies if in the future the type is changed to not literally store its length alongside the pointer. Though I'd argue that
Unfortunate naming, but I agree that it would be more consistent/symmetrical. I mean damn, |
Ah well, we'll discuss this at the next libs meeting regardless and I'm sure we'll have some feedback as well :) |
The libs team discussed this during triage yesterday and the conclusion was that we're ok with these methods so long as they have the |
@alexcrichton like this? ab136bf623e291a51067c70ec6c8a17cab0ac8e8 |
/// ``` | ||
#[unstable(feature = "cstr_from_bytes", reason = "recently added", issue = "0")] | ||
pub fn from_bytes_with_nul<'a>(bytes: &'a [u8]) -> Option<&'a CStr> { | ||
if bytes.is_empty() || memchr::memchr(0, &bytes) != Some(bytes.len() - 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't bytes.is_empty()
redundant?
EDIT: I see, it's needed to avoid an underflow in bytes.len() - 1
/// Creates a C string wrapper from a byte slice. | ||
/// | ||
/// This function will cast the provided `bytes` to a `CStr` wrapper after | ||
/// ensuring that it is null terminated but does not contain any interior |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"and does not contain"?
/// # } | ||
/// ``` | ||
#[unstable(feature = "cstr_from_bytes", reason = "recently added", issue = "0")] | ||
pub fn from_bytes_with_nul<'a>(bytes: &'a [u8]) -> Option<&'a CStr> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does lifetime elision not work here?
Sorry that took so long... |
I'm a bit iffy on the return type, but `Result` would also be a bit weird... The two error cases are `Unterminated` and `InteriorNul(usize)`. I considered `from_chars(&[c_char])` but this meshes better with `as_bytes_with_nul()` and Rust code in general. Should there also be a corresponding unsafe `from_bytes_unchecked` variant?
I'm a bit iffy on the return type, but
Result
would also be a bit weird... The two error cases areUnterminated
andInteriorNul(usize)
.I considered
from_chars(&[c_char])
but this meshes better withas_bytes_with_nul()
and Rust code in general.Should there also be a corresponding unsafe
from_bytes_unchecked
variant?