Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add FromBytes/IntoBytes methods which read from/write to an io::Read/io::Write #158

Open
2 tasks
joshlf opened this issue Feb 16, 2023 · 9 comments
Open
2 tasks
Labels
compatibility-nonbreaking Changes that are (likely to be) non-breaking

Comments

@joshlf
Copy link
Member

joshlf commented Feb 16, 2023

Progress

  • Add FromBytes::read_from_io and IntoBytes::write_to_io in 0.8
  • In 0.9, perform the following renames:
    • IntoBytes::write_to -> write_to_bytes (to be consistent with FromBytes::read_from_bytes)
    • Rename FromBytes::read_from_io -> read_from and IntoBytes::write_to_io -> write_to

Original text

Crosvm has a utility function called zerocopy_from_reader:

pub fn zerocopy_from_reader<R: io::Read, T: FromBytes>(mut read: R) -> io::Result<T> {
    // Allocate on the stack via `MaybeUninit` to ensure proper alignment.
    let mut out = MaybeUninit::zeroed();

    // Safe because the pointer is valid and points to `size_of::<T>()` bytes of zeroes,
    // which is a properly initialized value for `u8`.
    let buf = unsafe { from_raw_parts_mut(out.as_mut_ptr() as *mut u8, size_of::<T>()) };
    read.read_exact(buf)?;

    // Safe because any bit pattern is considered a valid value for `T`.
    Ok(unsafe { out.assume_init() })
}

Maybe we should add something similar to FromBytes (and potentially a write analogue to AsBytes)?

@joshlf joshlf added the compatibility-nonbreaking Changes that are (likely to be) non-breaking label Aug 12, 2023
@joshlf joshlf mentioned this issue Aug 20, 2023
@joshlf
Copy link
Member Author

joshlf commented Sep 18, 2024

@jswrenn
Copy link
Collaborator

jswrenn commented Sep 18, 2024

The obvious name, IMO, for this is read_from, but that name is currently squatted by a #[doc(hidden)] and #[deprecated].

Since it's marked #[doc(hidden)], it doesn't have any SemVer obligations, but in the interest of providing a smooth transition for users, perhaps we wait a release or two before revising it?

If this is the course we take, we should update its deprecation message to something like this:

#[deprecated(since = "0.8.0", note = "renamed to `FromBytes::read_from_bytes`. Migrate immediately. This item will soon be re-used for a different routine.")]

@jswrenn
Copy link
Collaborator

jswrenn commented Sep 18, 2024

Not sure what to call the _from_prefix variant, since read_from_prefix is already used, too (except for a visible, not deprecated function).

@joshlf
Copy link
Member Author

joshlf commented Sep 18, 2024

A few considerations:

  • I don't mind read_from_io_read or similar. It's weird, but also consistent with our naming convention of read_from_<place>
  • I don't think we need prefix/suffix variants since a reader vends single bytes at a time, and so it can always give us exactly size_of::<Self>() bytes
  • We will want a read_from_with_elems for Self: ?Sized + KnownLayout
  • We may also want an analogous IntoBytes::write_to method. In that case, perhaps we should rename our existing write_to to write_to_bytes for consistency?

@jswrenn
Copy link
Collaborator

jswrenn commented Sep 18, 2024

  • I don't think we need prefix/suffix variants since a reader vends single bytes at a time, and so it can always give us exactly size_of::<Self>() bytes

There's an oddity about Read here in that it provides read_to_end so it's quite possible to distinguish between the read-all and read-prefix cases.

@joshlf
Copy link
Member Author

joshlf commented Sep 18, 2024

IMO it's fine to provide the minimal API and let users manually construct fancier use cases. All of this can be built safely using our existing primitives anyway.

@NWPlayer123
Copy link

NWPlayer123 commented Sep 19, 2024

  • I don't mind read_from_io_read or similar. It's weird, but also consistent with our naming convention of read_from_<place>

Throwing out the option of read_from_stream, since the Read trait embodies a stream of bytes that you can just pull from instead of a discrete byte slice.

@russellbanks
Copy link

It would be helpful to consider TryFromBytes here too.

For example, I have this enum that represents compression:

#[repr(u8)]
pub enum Compression {
    Stored,
    Zlib,
    BZip2,
    LZMA1,
    LZMA2,
}

Currently, I use FromRepr from strum to get an enum from the byte:

let compression = reader.read_u8()?;
header.compression = Compression::from_repr(compression)
    .ok_or_else(|| eyre!("Unexpected compression value: {compression}"))?;

Now that we have TryFromBytes, I could use that instead. However, as I can't just zero the type and write data to it like you could do when using FromBytes, it requires me to read the data into a buffer, and then use TryFromBytes:

let mut buf = [0; size_of::<Compression>()];
reader.read_exact(&mut buf)?;
Compression::try_read_from_bytes(&buf)

This is now several lines every time I want to read an enum. It's difficult to make this generic as the below function fails to compile with constant expression depends on a generic parameter (@joshlf pointed out #248 here). Using vec![] works but then it's not allocated on the stack.

fn enum_value<T: KnownLayout + TryFromBytes + Sized>(reader: &mut impl Read) -> Result<T> {
    let mut buf = [0; size_of::<T>()];
    reader.read_exact(&mut buf)?;
    T::try_read_from_bytes(&buf)
}

I've also considered a macro, which does work, but it's not syntactically as nice as it could be.

macro_rules! enum_value {
    ($reader:expr, $ty:ty) => {{
        let mut buf = [0; size_of::<$ty>()];
        $reader.read_exact(&mut buf)?;
        <$ty>::try_read_from_bytes(&buf)
    }};
}

I've considered using try_transmute!() but it still requires me to read the correct size of the data and ultimately, it would be the easiest for me if there was a way to interpret bytes straight from the reader.

@joshlf
Copy link
Member Author

joshlf commented Nov 1, 2024

More prior art: The ReadBytesExt and WriteBytesExt traits from the byteorder crate as mentioned in #438 (comment).

@joshlf joshlf mentioned this issue Nov 4, 2024
8 tasks
@joshlf joshlf changed the title Add FromBytes constructor which reads from an io::Read Add FromBytes/IntoBytes methods which read from/write to an io::Read/io::Write Nov 4, 2024
jswrenn added a commit that referenced this issue Nov 4, 2024
jswrenn added a commit that referenced this issue Nov 5, 2024
jswrenn added a commit that referenced this issue Nov 5, 2024
jswrenn added a commit that referenced this issue Nov 5, 2024
google-pr-creation-bot pushed a commit to google-pr-creation-bot/zerocopy that referenced this issue Nov 5, 2024
jswrenn added a commit that referenced this issue Nov 7, 2024
Makes progress on #158.

Co-authored-by: Joshua Liebow-Feeser <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Nov 8, 2024
Makes progress on #158.

Co-authored-by: Joshua Liebow-Feeser <[email protected]>
joshlf added a commit that referenced this issue Nov 11, 2024
Makes progress on #158.

Co-authored-by: Joshua Liebow-Feeser <[email protected]>
gherrit-pr-id: I9253d6be7407d8d8679ee355e4e71cd9b15b9ff7
github-merge-queue bot pushed a commit that referenced this issue Nov 12, 2024
…2046)

Makes progress on #158.


gherrit-pr-id: I9253d6be7407d8d8679ee355e4e71cd9b15b9ff7

Co-authored-by: Jack Wrenn <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compatibility-nonbreaking Changes that are (likely to be) non-breaking
Projects
None yet
Development

No branches or pull requests

4 participants