-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tracking issue for integer types conversion to and from byte arrays #52963
Comments
I think this feature is ready for stabilization: @rfcbot fcp merge This is very soon after landing in Nightly in its current form, but the same functionality already went through FCP for stabilization in #49792 after a couple months in Nightly. The new API contains methods identical to the previous API (only renamed), plus new methods that implement what was previously documented as a typical usage of the previous API. So there isn’t really anything new here, I expect this final period to be a formality except for possible last-minute naming bikeshed. (The methods landed with consistent abbreviation, another option that was considered was fully expansion with for example |
Team member @SimonSapin has proposed to merge this. The next step is review by the rest of the tagged teams: Concerns:
Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
@rfcbot concern missing_impls I just noticed that we are still missing half the implementation for this. #51919 did just add the changes for signed integers, but not for unsigned. Compare: https://doc.rust-lang.org/nightly/std/primitive.u8.html#method.to_bytes cc @tbu- |
Same as rust-lang#51919 did for signed integers. Tracking issue: rust-lang#52963
I keep reading it as "not equal". |
@shepmaster I agree that’s a real negative point. The other naming pattern that was considered (consistent full expansion) is rather verbose: |
`{to,from}_{ne,le,be}_bytes` for unsigned integer types Same as rust-lang#51919 did for signed integers. Tracking issue: rust-lang#52963
@SimonSapin: Yeah, seems to be all in order now. @rfcbot resolved missing_impls |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
Ok, this is the second (different) misinterpretation of @rust-lang/libs, any thoughts? |
Host ( If the name is being changed it should preferably be to something that will resolve all ambigouties, not change some ambigouties for different ones. Due to the abbriviation it is already expected that users will have to look up the function in the docs the first time they're using it. Because of this, less idiomatic alternatives might be OK. In these situations there are also arguments for the idiosyncratic alternative to be better, as it makes it difficult to assume things, forcing users to look it up in the docs. What about:
|
"Host" makes the wrong kind of sense in cross-compilation context, since this API is using the target machine’s endianness with |
Yes this was my exact concern. That's why i suggested to change it to something other than |
If there's confusion about the short names then I'd personally err on the side of more wordy names and expand out |
aww yis. Excited for this one. It makes some clean looking code. let magic_bytes = header.magic.to_ne_bytes();
if magic_bytes != ['h', 'p', 'k', 'r'] { |
@kallisti5 Did you mean to write byte character literals instead of chars? Assuming they're meant to be byte literal characters you can go even further and do it like this ;) let magic_bytes = header.magic.to_ne_bytes();
if magic_bytes != *b"hpkr" { |
This is behind the `i128_blob` feature. Blobs are stored as 16 byte big-endian values, with their most significant bit flipped. This is so that sorting, comparison, etc all work properly, even with negative numbers. This also allows the representation to be stable across different computers. It's possible that the `FromSql` implementation should handle the case that the real value is stored in an integer. I didn't do this, but would be willing to make the change. I don't think we should store them this way though, since I don't think users would be able to sort/compare them sanely. Support for `u128` is not implemented, as comparison with i128 values would work strangely. This also is consistent with `u64` not being allowed, not that I think that would be reason enough on it's own. The `byteorder` crate is used if this feature is flipped, as it's quite small and implements things more or less optimally. If/when `i128::{to,from}_be_bytes` gets stabilized (rust-lang/rust#52963), we should probably use that instead.
Not sure if it's too late, but is there a reason the In terms of performance a slice version would only have a runtime cost if used with an actual slice, since it's I ran into this today and it was a bit annoying. |
By taking in an array, you are statically guaranteed that there's enough data to make the transformation — you don't have to even allow for the error case.
With #![feature(int_to_from_bytes, try_from)]
use std::convert::TryInto;
fn example(bytes: &[u8]) -> i32 {
let bytes: &[_; 4] = bytes.try_into().expect("Wrong number of bytes");
i32::from_be_bytes(*bytes)
} |
Yeah, I agree, but it seems that a slice is just a superset of the abilities of an array. While I agree it's cleaner and makes more sense in a standalone implementation, I don't know of many situations where I'm dealing with something I need to convert into a number did not come from a bigger vector or slice. At least for me I don't ever see using this functionality without using a wrapper of the sorts of the snippit you posted. I guess I'm not sure how often you are dealing with packed data and are working with fixed sized arrays as the source of this data is almost always network or file streams. The only place I work with fixed sized arrays when working with files/network is when I'm casting to a structure, and at that point I'm just going to directly cast it to the u64 and skip the middle step. The I have a library https://github.com/gamozolabs/safecast which allows me to safely cast or copy things with runtime checks if needed. I pretty much must use this library in all of my Rust projects as almost everything I do is working with network/file/system level packed data. To me this just happens to be a very sensitive issue as operating on binary data from safe Rust is incredibly clunky, and is the last major issue I have personally with the language. While crates like No hard feelings, just trying to chime in about the usability of this API from my use cases (kernel development / fuzzer development). Many of my peers (other security researchers) who also work on developing code to parse/mutate/extract information from binary packed data have had similar gripes and ultimately a few have refused to use the language for these purposes as it's not feasible for codebases that largely operate on raw binary data. I likely will propose an RFC along the lines of my -B |
As shepmaster suggested, what does an API taking a slice do if the input is too short? Too long? Does it panic, or ignore extra items, or do we change the return type to a In my opinion the main goal of this API is to replace some unsafe calls to |
I'm not familiar with exactly what |
|
Right, I meant to say, I'm not sure what makes the compiler decide when to follow the hint and when not to. I thought I'd seen an issue with |
This is behind the `i128_blob` feature. Blobs are stored as 16 byte big-endian values, with their most significant bit flipped. This is so that sorting, comparison, etc all work properly, even with negative numbers. This also allows the representation to be stable across different computers. It's possible that the `FromSql` implementation should handle the case that the real value is stored in an integer. I didn't do this, but would be willing to make the change. I don't think we should store them this way though, since I don't think users would be able to sort/compare them sanely. Support for `u128` is not implemented, as comparison with i128 values would work strangely. This also is consistent with `u64` not being allowed, not that I think that would be reason enough on it's own. The `byteorder` crate is used if this feature is flipped, as it's quite small and implements things more or less optimally. If/when `i128::{to,from}_be_bytes` gets stabilized (rust-lang/rust#52963), we should probably use that instead.
FCP to merge has completed a while ago, but there has been a number of additional comments since. However I believe not enough reasons to change the API. If you disagree, now is (past) the time to say so. Here is a stabilization PR: #56207 |
Note, that this was the same with the API for Not that I’ve have any preference either way. |
Stabilize the int_to_from_bytes feature Fixes rust-lang#52963 FCP to merge completed: rust-lang#52963 (comment)
#56216 will hopefully make it easier to use these APIs with slices. |
A previous incarnation of this API that went though FCP in #49792, but stabilization was reverted while still in Nightly in favor of this API.
Implemented in #51919, conversions between all integer types and appropriately-sized arrays of
u8
, using the specified endianness:ne
,le
, andbe
mean native-endian, little-endian, and big-endian respectively. Native-endian means the target platform’s endianness. Those conversions are implemented as literallytransmute
and nothing else (but safe). The other conversions use swap or do not swap the byte order depending on the target’s endianness.The text was updated successfully, but these errors were encountered: