[libs] add recursive variant of fs::read_dir
#69684
Comments
Oh I guess an obvious omission is I would imagine functionality as provided by |
Imo there is no obvious, right solution for recursive directory walking. Do you follow symlinks or not? Do you respect device boundaries? Should inaccessible directories be communicated as errors or be silently skipped? Should directories themselves be part of the resulting entries or not? How do you avoid descending into potentially huge hidden dirs (e.g. cache dirs, or dirs containing git objects)? How do you handle recursion depth (which again is a bigger issue when symlinks are involved)? For
That is not a bash built-in, but a standard Unix tool with multiple implementations (GNU, busybox, ...). And it has a ton of options, which also indicates that there's no one-size-fits-all solution.
That's not the only one! For example there's ignore which implements a parallel walker which can be necessary because walking directories is IO and thus slow. In my experience naive directory walking is agonizingly slow when performed on network filesystems. Speaking of which, async getdents might become a thing at some point and then you'll want to steer users towards that instead of a slow synchronous implementation which alternates between IO and consuming results. There are some other crates too.
That's more of a libc function. On Linux it's using the getdents(2) syscall under the hood. Perhaps you were thinking of ftw(3), which does the filtering in userspace. So given all these issues, maybe that's one of those cases where batteries shouldn't be included. |
I agree with @the8472, there are many ways to skin this cat, and I never expected std to provide this. |
I think this is a reasonable statement. Even the

use walkdir::WalkDir;

for entry in WalkDir::new("foo") {
    let entry = entry.unwrap();
    println!("{}", entry.path().display());
}

That's virtually identical to your proposed simple API. For more complex operations in which you wanted to set options we could provide a builder type. This would mirror the simple APIs of |
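To make the builder idea above concrete, here is a minimal std-only sketch. The type `ReadDirAll` and its methods `follow_links`, `max_depth` and `run` are hypothetical names chosen for illustration; nothing like this exists in `std::fs`, and the sketch deliberately omits symlink cycle detection.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical builder; the type and method names are illustrative only.
struct ReadDirAll {
    root: PathBuf,
    follow_links: bool,
    max_depth: usize,
}

impl ReadDirAll {
    fn new(root: impl AsRef<Path>) -> Self {
        ReadDirAll {
            root: root.as_ref().to_path_buf(),
            follow_links: false,
            max_depth: usize::MAX,
        }
    }

    /// Whether symlinks to directories are descended into.
    /// Assumption: no cycle detection is done, so a real design would need more.
    fn follow_links(mut self, yes: bool) -> Self {
        self.follow_links = yes;
        self
    }

    /// Depth 1 means only the direct children of the root; 0 yields nothing.
    fn max_depth(mut self, depth: usize) -> Self {
        self.max_depth = depth;
        self
    }

    /// Eagerly collects the paths of all entries up to `max_depth`.
    fn run(self) -> io::Result<Vec<PathBuf>> {
        let mut found = Vec::new();
        if self.max_depth == 0 {
            return Ok(found);
        }
        // Each pending directory carries the depth of the entries it contains.
        let mut pending = vec![(self.root.clone(), 1usize)];
        while let Some((dir, depth)) = pending.pop() {
            for entry in fs::read_dir(&dir)? {
                let entry = entry?;
                let path = entry.path();
                let is_dir = if self.follow_links {
                    // `fs::metadata` follows symlinks; broken links count as non-directories.
                    fs::metadata(&path).map(|m| m.is_dir()).unwrap_or(false)
                } else {
                    // `DirEntry::file_type` does not follow symlinks.
                    entry.file_type()?.is_dir()
                };
                if is_dir && depth < self.max_depth {
                    pending.push((path.clone(), depth + 1));
                }
                found.push(path);
            }
        }
        Ok(found)
    }
}

fn main() -> io::Result<()> {
    // List the current directory and one level below it.
    for path in ReadDirAll::new(".").max_depth(2).run()? {
        println!("{}", path.display());
    }
    Ok(())
}
```

Even this small sketch has to pick answers to several of the questions raised earlier in the thread (symlinks, depth, whether directories are listed, how errors surface), which is the trade-off being debated.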
Quote BurntSushi:
Note that walkdir, find and nftw do quite some bookkeeping, so they don't provide optimal performance. Such a function would require several options with compilation time cost to be a good solution for all cases, and further requires hand-tuning per OS to be optimal (the syscall wrapper cost becomes significant with the number of calls): https://github.com/romkatv/gitstatus/blob/master/docs/listdir.md If being (performance-)optimal with respect to the use case is still a goal of Rust, then I would advise against this and also suggest closing this issue, as providing fine-grained configurability for applications at scale will lead to bloat and hard maintenance (the compatibility hazard of C++). |
Regarding performance, io_uring will gain getdents support in 5.17. This could even benefit single-threaded, synchronous code by queuing up requests that will be fulfilled in the background while user code processes the results. |
Motivation
As part of the fs module there exist various recursive and non-recursive operations: create_dir and create_dir_all, remove_dir and remove_dir_all. But to read the contents of a directory there currently only exists read_dir, with no recursive counterpart:
fs::create_dir
fs::create_dir_all
fs::remove_dir
fs::remove_dir_all
fs::read_dir
I noticed the omission of this method when trying to read out a directory recursively, and discovered this was the only directory operation that doesn't have a recursive counterpart.
Usage overview
I'd imagine fs::read_dir_all would be used in much the same way as fs::read_dir:
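A minimal sketch of what such a call could look like, assuming a free function named `read_dir_all` that eagerly returns a `Vec<PathBuf>`. The function below is a hypothetical stand-in written against today's `std::fs`; the signature proposed in the issue may well differ, for instance by returning an iterator of `DirEntry` values the way `fs::read_dir` does.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the proposed `fs::read_dir_all`: collects every
/// entry below `root`, descending into real directories but not into symlinks.
fn read_dir_all(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    // Explicit stack instead of recursion, so deep trees cannot overflow the call stack.
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            // `DirEntry::file_type` does not follow symlinks, so a symlink to a
            // directory is listed but not traversed.
            if entry.file_type()?.is_dir() {
                pending.push(entry.path());
            }
            found.push(entry.path());
        }
    }
    Ok(found)
}

fn main() -> io::Result<()> {
    // Mirrors the usual `fs::read_dir` loop, but covers the whole tree.
    for path in read_dir_all(Path::new("."))? {
        println!("{}", path.display());
    }
    Ok(())
}
```

This version stops at the first I/O error and lists directories alongside files; both are choices made for the sketch, not something the proposal specifies.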
Implementation overview
Roughly, the API additions to std::fs would look like this:
Potential drawbacks
I remember reading something about the readdir(3) syscall, and file filtering as part of the kernel. But I can't seem to find any reference to that anymore. I also thought this had been referenced in a prior RFC, but that too I cannot find.
But for argument's sake, even if there was a variant of readdir that took a filter, I'd argue that since std::fs::read_dir doesn't provide filtering, neither should std::fs::read_dir_all. And if we'd want a variant that does provide filtering, it would not be a clean counterpart to fs::read_dir, so it would warrant introducing under a new name.
These functions should be able to co-exist with each other, much the same way fs::read_to_string and io::Read::read_to_string co-exist (both useful, but one is a simplified version of the other).
Similar functionality in other languages
std::filesystem::recursive_directory_iterator since C++17.
find command generally available. $ find . will recursively print out all directory contents.
Conclusion
I think fs::read_dir_all makes for a straightforward addition to the stdlib. People expect this functionality to exist by default, and the shape and name of this function don't seem controversial. Even if filtering supersets of this functionality could possibly exist, they would likely be a different shape, and could co-exist with this function. Thanks!