-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
specialize io::copy to use copy_file_range, splice or sendfile #75272
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @KodrAus (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
c9d1e81
to
ac58849
Compare
library/std/src/io/copy.rs
Outdated
fn copy_specialization() -> Result<()> { | ||
let path = crate::env::temp_dir(); | ||
let source_path = path.join("copy-spec.source"); | ||
let sink_path = path.join("copy-spec.sink"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does std have some policy about tests polluting the filesystem? there doesn't seem to be any utility for temporary files and libc doesn't even have a wrapper for memfd_create
:(
☔ The latest upstream changes (presumably #75715) made this pull request unmergeable. Please resolve the merge conflicts. |
8258196
to
f867287
Compare
☔ The latest upstream changes (presumably #76047) made this pull request unmergeable. Please resolve the merge conflicts. |
☔ The latest upstream changes (presumably #76265) made this pull request unmergeable. Please resolve the merge conflicts. |
Thanks for your patience on this so far @the8472! Since this is quite a large PR it'll take me a while to dig through it all. It's tricky to benchmark IO code, but I would be interested to see roughly what the impact of this is so we can justify the new machinery a bit. |
☔ The latest upstream changes (presumably #75428) made this pull request unmergeable. Please resolve the merge conflicts. |
Edit: posted more recent benchmarks in initial comment All benchmarks copy 128KiB per bencher iteration. Larger files will probably achieve higher speeds. While benchmarking I have noticed that results involving pipes can be quite erratic since currently the only form of pipes exposed in the public API is for child process stdio, which means the performance characteristics depend on the process on the other end. So I'll go ahead and change the behavior to do syscall bashing to figure out which one works on a pair of file descriptors, with this preference ranking: copy_file_range > sendfile > splice > read+write |
@KodrAus updated and shaved off some syscalls. You can find benchmarks on top. |
Thanks @the8472! I'll give this a thorough review now 👀 |
Thanks for your patience on this one @the8472! I just had to confirm that we're ok landing features that depend on specialization again (we had a block on it for a while, but with |
Ok, but if you had mentioned that I could have pointed you to other specialization PRs that have landed since. And I am still anticipating the review itself. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this @the8472!
I've left a few comments to start. I think we should refactor this into the sys
module, to also avoid having to leak the CopyResult
types that really only make sense for this specific implementation of io::copy
.
I haven't dug into the specialized implementations just yet, or the changes to unix::fs
.
library/std/src/io/copy.rs
Outdated
|
||
fn fd_to_meta<T: AsRawFd>(fd: &T) -> FdMeta { | ||
let fd = fd.as_raw_fd(); | ||
let file: ManuallyDrop<File> = ManuallyDrop::new(unsafe { File::from_raw_fd(fd) }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate we don't seem to have a better way to get the metadata for a file descriptor.
Is it actually always valid for us to convert an arbitrary file descriptor into a File
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just a nicer way to call fstat/statx
. It's equivalent to but more efficient and reliable than File::open("/proc/self/fd/<fd>")?.metadata()?
library/std/src/io/copy.rs
Outdated
match result { | ||
CopyResult::Ended(Ok(bytes_copied)) => return Ok(bytes_copied + written), | ||
CopyResult::Ended(err) => return err, | ||
CopyResult::Fallback(bytes) => written += bytes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it expected for us to fallthrough to the next if
here and then to the generic copy at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's why the enum valuse is called Fallback
. If that is not obvious it might need a better name.
library/std/src/io/copy.rs
Outdated
match result { | ||
CopyResult::Ended(Ok(bytes_copied)) => return Ok(bytes_copied + written), | ||
CopyResult::Ended(err) => return err, | ||
CopyResult::Fallback(bytes) => written += bytes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it expected for us to fallthrough to the next if
here and then to the generic copy at the end?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
d2bea7b
to
a7e6535
Compare
Should be ready for another review pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I've done another pass over the code. I think the implementation is good, but I think we should try limit the visibility of the internals needed by the specialized copy as much as possible. Ideally, I think the module structure could look something like this:
pub mod sys {
pub mod unix {
mod copy {
#[cfg(any(target_os = "linux", target_os = "android"))]
pub fn copy_fs(from: &Path, to: &Path) -> io::Result<u64> {}
pub fn copy<R, W>(reader: R, writer: W) -> io::Result<usize> {}
}
pub mod fs {
#[cfg(any(target_os = "linux", target_os = "android"))]
pub use super::copy::copy_fs as copy;
}
pub use self::copy::copy;
}
}
pub mod io {
mod copy {
pub fn copy<R, W>(reader: R, writer: W) -> io::Result<usize> {
cfg_if::cfg_if! {
if #[cfg(unix)] {
crate::sys::unix::copy(reader, writer)
} else {
copy_fallback(reader, writer)
}
}
}
pub(crate) fn copy_fallback<R, W>(reader: R, writer: W) -> io::Result<usize> {}
}
pub use self::copy::copy;
}
That way none of the CopyResult
or other internal enums need to be visible in the rest of the crate. What do you think?
@@ -315,6 +315,7 @@ | |||
#![feature(toowned_clone_into)] | |||
#![feature(total_cmp)] | |||
#![feature(trace_macros)] | |||
#![feature(try_blocks)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little surprised this wasn't here already... As far as I know the feature is somewhat blocked, but it looks like we do use it elsewhere in the repo so I'm ok with adding it here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I only used it for cleanup in tests. But that was before I learned that other IO tests don't clean up the files they create either. So removing it and adding to the clutter instead would also be an option.
|
||
/// The general read-write-loop implementation of | ||
/// `io::copy` that is used when specializations are not available or not applicable. | ||
pub(crate) fn generic_copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a more descriptive name for this might be copy_fallback
since we only use it when specialized implementations aren't available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if fallback is the right word. From the perspective of the specializations it indeed is the fallback. But the specializations only apply in some narrow scenarios. In the more general case of arbitrary Read
/Write
pairs this is the default strategy.
The issue is that there are more implementations of |
dracut-cpio is a minimal cpio archive creation utility written in Rust. It provides support for a minimal set of features needed to create performant and space-efficient initramfs archives: - "newc" archive format only - reproducible; inode numbers, uid/gid and mtime can be explicitly set - data segment copy-on-write reflinks + using Rust io::copy()'s native copy_file_range() support[1] + optional archive data segment alignment for optimal reflink use[2] - hardlink support - comprehensive tests asserting GNU cpio binary output compatibility 1. Rust io::copy() copy_file_range() rust-lang/rust#75272 2. Data segment alignment We're bending the newc spec a bit to inject zeros after the file path to provide data segment alignment. These zeros are accounted for in the namesize, but some applications may only expect a single zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs handle this fine as long as PATH_MAX isn't exceeded. Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio is a minimal cpio archive creation utility written in Rust. It provides support for a minimal set of features needed to create performant and space-efficient initramfs archives: - "newc" archive format only - reproducible; inode numbers, uid/gid and mtime can be explicitly set - data segment copy-on-write reflinks + using Rust io::copy()'s native copy_file_range() support[1] + optional archive data segment alignment for optimal reflink use[2] - hardlink support - comprehensive tests asserting GNU cpio binary output compatibility 1. Rust io::copy() copy_file_range() rust-lang/rust#75272 2. Data segment alignment We're bending the newc spec a bit to inject zeros after the file path to provide data segment alignment. These zeros are accounted for in the namesize, but some applications may only expect a single zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs handle this fine as long as PATH_MAX isn't exceeded. Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio is a minimal cpio archive creation utility written in Rust. It provides support for a minimal set of features needed to create performant and space-efficient initramfs archives: - "newc" archive format only - reproducible; inode numbers, uid/gid and mtime can be explicitly set - data segment copy-on-write reflinks + using Rust io::copy()'s native copy_file_range() support[1] + optional archive data segment alignment for optimal reflink use[2] - hardlink support - comprehensive tests asserting GNU cpio binary output compatibility 1. Rust io::copy() copy_file_range() rust-lang/rust#75272 2. Data segment alignment We're bending the newc spec a bit to inject zeros after the file path to provide data segment alignment. These zeros are accounted for in the namesize, but some applications may only expect a single zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs handle this fine as long as PATH_MAX isn't exceeded. Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio is a minimal cpio archive creation utility written in Rust. It provides support for a minimal set of features needed to create performant and space-efficient initramfs archives: - "newc" archive format only - reproducible; inode numbers, uid/gid and mtime can be explicitly set - data segment copy-on-write reflinks + using Rust io::copy()'s native copy_file_range() support[1] + optional archive data segment alignment for optimal reflink use[2] - hardlink support - comprehensive tests asserting GNU cpio binary output compatibility 1. Rust io::copy() copy_file_range() rust-lang/rust#75272 2. Data segment alignment We're bending the newc spec a bit to inject zeros after the file path to provide data segment alignment. These zeros are accounted for in the namesize, but some applications may only expect a single zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs handle this fine as long as PATH_MAX isn't exceeded. Signed-off-by: David Disseldorp <[email protected]> (cherry picked from commit 300e4b1)
dracut-cpio is a minimal cpio archive creation utility written in Rust. It provides support for a minimal set of features needed to create performant and space-efficient initramfs archives: - "newc" archive format only - reproducible; inode numbers, uid/gid and mtime can be explicitly set - data segment copy-on-write reflinks + using Rust io::copy()'s native copy_file_range() support[1] + optional archive data segment alignment for optimal reflink use[2] - hardlink support - comprehensive tests asserting GNU cpio binary output compatibility 1. Rust io::copy() copy_file_range() rust-lang/rust#75272 2. Data segment alignment We're bending the newc spec a bit to inject zeros after the file path to provide data segment alignment. These zeros are accounted for in the namesize, but some applications may only expect a single zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs handle this fine as long as PATH_MAX isn't exceeded. Signed-off-by: David Disseldorp <[email protected]>
dracut-cpio is a minimal cpio archive creation utility written in Rust. It provides support for a minimal set of features needed to create performant and space-efficient initramfs archives: - "newc" archive format only - reproducible; inode numbers, uid/gid and mtime can be explicitly set - data segment copy-on-write reflinks + using Rust io::copy()'s native copy_file_range() support[1] + optional archive data segment alignment for optimal reflink use[2] - hardlink support - comprehensive tests asserting GNU cpio binary output compatibility 1. Rust io::copy() copy_file_range() rust-lang/rust#75272 2. Data segment alignment We're bending the newc spec a bit to inject zeros after the file path to provide data segment alignment. These zeros are accounted for in the namesize, but some applications may only expect a single zero-terminator (and 4 byte alignment). GNU cpio and Linux initramfs handle this fine as long as PATH_MAX isn't exceeded. Signed-off-by: David Disseldorp <[email protected]> (cherry picked from commit a9c6704)
I stumbled across the fact that we no longer need coreos/openat-ext@c377a54 because rust-lang/rust#75272 landed just a few months after! While we're here, slightly clean up the fd dance to make things a bit safer using `BorrowedFd`. It's interesting to note here that with io-lifetimes we could add a method to the glib crate to borrow the underlying fd safely.
It's no longer needed since rust-lang/rust#75272
It seems to be upstream now, see: rust-lang/rust#75272 However for rsop splice is not used anymore, see 'test_splice' script.
It seems to be upstream now, see: rust-lang/rust#75272 However for rsop splice is not used anymore, see 'test_splice' script.
It is now done transparently by the Rust stdlib: rust-lang/rust#75272
Fixes #74426.
Also covers #60689 but only as an optimization instead of an official API.
The specialization only covers std-owned structs so it should avoid the problems with #71091
Currently linux-only but it should be generalizable to other unix systems that have sendfile/sosplice and similar.
There is a bit of optimization potential around the syscall count. Right now it may end up doing more syscalls than the naive copy loop when doing short (<8KiB) copies between file descriptors.
The test case executes the following:
0-1
stat
calls to identify the source file type. 0 if the type can be inferred from the struct from which the FD was extracted𝖬
write
to drain theBufReader
/BufWriter
wrappers. only happen when buffers are present. 𝖬 ≾ number of wrappers present. If there is a write buffer it may absorb the read buffer contents first so only result in a single write. Vectored writes would also be an option but that would require more invasive changes toBufWriter
.𝖭
copy_file_range
/splice
/sendfile
until file size, EOF or the byte limit fromTake
is reached. This should generally be much more efficient than the read-write loop and also have other benefits such as DMA offload or extent sharing.Benchmarks