8.1.0 find the targets very slow in a large directory when using fzf. 8.0.0 works well. #599
Comments
Thank you for reporting this.
Note that this might scan much more than your home directory due to
As far as I understand this correctly, there could be two reasons for that: (1) The whole As for (2), note that the output order of
Maybe you want to use
Again. The output order is not deterministic.
Are you sure that nothing else changed? Could you please run Save 'fd version 8.x.0' as
I personally cannot see any performance differences between the two versions. |
Thank you for your help. Do you see my two update statements along with two GIFs at the end of my last post? I guarantee that I didn't change anything but the different versions. I will demonstrate what I did below: (my OS is macOS 10.15.4)
The procedure I described above is complete. Thank you. |
I certainly believe you that this is all it takes to reproduce the error. The problem is that (1) I don't have your home folder, (2) I don't have macOS, and (3) I don't really want to debug things in This is why I proposed a direct benchmark of
# brew install fd v8.x.0
cp $(which fd) /tmp/fd-v8.x.0 |
Ok, there seems to be a drastic regression. But you are running the command from |
Sure. Still running under my home folder. Just a moment please. :) |
8.0.0 is faster (not sure whether it is accurate because I was watching YouTube while hyperfine was running :P). And actually, based on my observation when running fzf, the scanning speed of both 8.0.0 and 8.1.0 is very fast. When we run fzf, we can see a number that shows how many entries have been scanned. The number increases very fast for both versions. My concern is that 8.0.0 can get the target immediately but 8.1.0 cannot. Also, I ran this test under some different directories. 8.1.0 is slower. Screenshots attached below. |
If the overall speed of both searches is roughly the same, I guess it's mostly a matter of luck because the order of the results is non-deterministic, as I have mentioned. Note that the search within However, I have no idea why 8.1 would consistently show the Unrelated to this ticket, but: if the search really takes 160 seconds, you might consider dropping either
Oh, wow. That definitely seems like a significant drop in performance. It would be great if you could help me figure out what is going on, because I can definitely not reproduce this locally:
The first thing we should check: do both commands actually return the same results? Could you please do the following:
fd-v8.0.0 --hidden --follow --exclude .git | sort > /tmp/output_8.0
fd-v8.1.0 --hidden --follow --exclude .git | sort > /tmp/output_8.1
diff /tmp/output_8.0 /tmp/output_8.1
If this does not show any differences, we would need to find the commit that caused the regression, because I cannot really see any obvious things in the CHANGELOG that could have caused this. The most dramatic change is the disabling of jemalloc in fd 8.1, which could definitely lead to a performance regression like this. But if I'm not mistaken, Homebrew had already disabled jemalloc via the patch in https://raw.githubusercontent.com/Homebrew/homebrew-core/d874b06712ec20efd86f2fbf20e97aa2f24e9f5b/Formula/fd.rb |
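The sort-then-diff check above exists because the output order is not deterministic. As a rough illustration only, the same order-insensitive comparison could be sketched in Python. The function name and the sample listings are hypothetical and are not output from either fd version:

```python
from collections import Counter

def compare_listings(old_lines, new_lines):
    """Return (only_in_old, only_in_new), ignoring the order of the lines."""
    old_counts = Counter(old_lines)
    new_counts = Counter(new_lines)
    # Counter subtraction keeps only the entries with a positive remainder.
    only_in_old = sorted((old_counts - new_counts).elements())
    only_in_new = sorted((new_counts - old_counts).elements())
    return only_in_old, only_in_new

# Hypothetical sample output from two runs, listed in different orders.
run_a = ["./a", "./b/c", "./b"]
run_b = ["./b", "./a", "./b/c", "./d"]
only_old, only_new = compare_listings(run_a, run_b)
print("only in first run:", only_old)
print("only in second run:", only_new)
```

If both lists come back empty, the two runs found the same entries and only the ordering differs.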
I think I might have found the reason why this happens. I actually upgraded the This would explain perfectly well why you previously saw low-depth results first (as the search was breadth first) while it takes almost until the end of the search now (with depth first traversal). The overall search time might also be influenced by this, as the scheduling of different threads might be affected by this (?).
1.03 x faster means: 3% faster. Not very conclusive. But the runs in the other two directories show a much larger gap. |
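To make the breadth-first versus depth-first explanation above concrete, here is a minimal single-threaded sketch. It is not how fd or its directory walker is implemented (the real traversal is parallel and its order is non-deterministic). The tree, the function names, and the entry names are all made up for illustration:

```python
from collections import deque

# Hypothetical directory tree: each dict maps an entry name to its children.
TREE = {
    "Library": {
        "Caches": {"app": {"config": {}, "data": {}}},
        "Preferences": {"config": {}},
    },
    ".config": {"fd": {}},
    "gitrepos": {},
}

def walk_bfs(tree):
    """Yield paths level by level, so shallow entries come out first."""
    queue = deque(("", name, children) for name, children in tree.items())
    while queue:
        parent, name, children = queue.popleft()
        path = f"{parent}/{name}" if parent else name
        yield path
        queue.extend((path, n, c) for n, c in children.items())

def walk_dfs(tree, parent=""):
    """Yield paths depth first: finish each subtree before its siblings."""
    for name, children in tree.items():
        path = f"{parent}/{name}" if parent else name
        yield path
        yield from walk_dfs(children, path)

def position_of(walker, target):
    """Number of entries emitted up to and including the target."""
    for count, path in enumerate(walker, start=1):
        if path == target:
            return count
    return None

print("BFS position of .config:", position_of(walk_bfs(TREE), ".config"))
print("DFS position of .config:", position_of(walk_dfs(TREE), ".config"))
```

Both walks visit the same entries in total. What differs is how many entries are emitted before a shallow target such as .config appears, which is what matters for time to first result in an interactive tool like fzf.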
Wow, so it is! Thank you. Is it possible for fd to provide an option to choose depth-first traversal or breadth-first traversal? For fzf, breadth-first traversal is much more efficient, and in the large directory (if some directory is very deep) depth-first traversal is terrible.
Yes. Especially in a small directory, the gap is large. |
I first need to understand all the implications of this change. This comes as a surprise to me as well. I certainly agree that breadth-first sounds more useful for the typical |
Roger that! Thanks. :P |
Breadth-first is almost always better for finding files: https://twitter.com/tavianator/status/1144718120852103168 I am curious to try the test case that motivated the |
@tavianator Thank you for the feedback. I have performed a few benchmarks on my machine and haven't found any case where the new version of Also, note that If someone can come up with a reproducible test case where the new version is definitely slower, I'm happy to revisit this ticket. |
Yeah it mainly matters for interactive use, where you care about time to first result. |
In terms of a reproducible benchmark, maybe:
But weirdly, v8.1.1 just seems to be slower overall:
On a Mac, I got these I'm not familiar with fd/rg/ignore internals and how the parallelisation works, but is iterative deepening DFS (perhaps with exponential deepening) a search option that could be made to work? |
Iterative deepening is particularly bad on most directory trees in my experience (bfs implements it under Exponential deepening should be better, I'll try it out. |
I implemented exponential deepening in this commit. It's about 85% slower than simple breadth-first search on the Linux kernel source tree. That's much better than the standard iterative deepening, which is about 5x slower. |
Thanks for investigating / implementing! Looks like the maximum depth for the Linux source is 9, which is unlucky in that it's just past another deepening threshold. But if we limit to depth 7, it's pretty competitive with DFS. Obviously this depends on directory tree, but I guess if fd-find is looking to implement something like this, maybe deepening to 1, 2, 4, ∞, could resolve this issue without too much of a completion time compromise.
|
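The "deepening to 1, 2, 4, ∞" idea above can be sketched as repeated depth-limited walks with growing limits. This is only an illustration under simplifying assumptions (single-threaded, in-memory tree, hypothetical names). It is not the bfs implementation linked above and not anything fd provides:

```python
import math

# Hypothetical directory tree: each dict maps an entry name to its children.
TREE = {
    "Library": {
        "Caches": {"app": {"config": {}, "data": {}}},
        "Preferences": {"config": {}},
    },
    ".config": {"fd": {}},
    "gitrepos": {},
}

def depth_limited(tree, limit, depth=1, parent=""):
    """Depth-first walk yielding (depth, path), never descending below limit."""
    for name, children in tree.items():
        path = f"{parent}/{name}" if parent else name
        yield depth, path
        if depth < limit:
            yield from depth_limited(children, limit, depth + 1, path)

def exponential_deepening(tree, schedule=(1, 2, 4, math.inf)):
    """Repeat depth-limited walks with growing limits; emit each entry once."""
    results = []
    visits = 0  # counts repeated work, including entries seen in earlier passes
    previous_limit = 0
    for limit in schedule:
        for depth, path in depth_limited(tree, limit):
            visits += 1
            # Entries at or above previous_limit were emitted by an earlier pass.
            if depth > previous_limit:
                results.append(path)
        previous_limit = limit
    return results, visits

results, visits = exponential_deepening(TREE)
print("first results:", results[:3])
print("entries emitted:", len(results), "entries visited:", visits)
```

The shallow entries are emitted first, as with breadth-first search, at the cost of revisiting the upper levels on every pass. The visit counter makes that overhead visible. This naive version always runs the final unlimited pass, and a more careful one would stop as soon as a pass finds nothing cut off by the limit.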
Hi @sharkdp
Any updates? Thank you very much. |
As I investigated further, See the GIF below. I run Then next, I uninstalled |
that would be consistent with doing a depth-first traversal, since it probably searches |
@tmccombs Exactly. This is what sharkdp mentioned in his previous comment (#599 (comment)). ripgrep switched to DFS. Even though this switch reduces the peak memory usage and won't affect the time of the full search, it may take a very long time for a shallow target to be found. And fzf needs the shallow results to be shown quickly. Any change will have pros and cons. The author of ripgrep said switching to DFS can make some cases perform better. However, it brings this side effect to fzf. |
Hi,
I am using fd as the default command of fzf. After updating to the latest version, I find it is very slow to find the target in a large directory (like home).

For FZF, I set export FZF_DEFAULT_COMMAND="fd --hidden --follow --exclude .git". Then I run fzf under my home directory. Next, I type a directory name which is right under home directory, say .config, and it will take very very long to find ~/.config. It shows many files under ~/Library which I do not care about. I attached a screenshot below. You can see that ~/.config won't show up even though it has scanned 1080073 entries.

Before updating fd and I cannot remember which version it was, it works very well. When I type .config, it will show up almost immediately. The results could be adjusted automatically based on which query I typed so that my target will show up right away. For example, when I typed .config, it would not show me all those Library/..../config which I did not care about, and instead it would show ~/.config. Very smart.

What version of fd are you using?
fd 8.1.0
Which operating system / distribution are you on?
macOS 10.15.4 (the latest)
Thank you very much!!
Update1:
To be clearer, I put two GIF below (this time I ran fzf under home and tried to search a directory, ~/gitrepos):
1). I didn't set fzf default command, which means let fzf use the default find. When I started to type gitrepos, the targets appeared immediately.
2). I set fzf default command to fd --type f. This time when I started to type gitrepos, the targets wouldn't appear until all entries were scanned.
Update2:
I just now deleted the latest version of fd and re-installed version 8.0.0 (with brew install https://raw.githubusercontent.com/Homebrew/homebrew-core/d874b06712ec20efd86f2fbf20e97aa2f24e9f5b/Formula/fd.rb). The original performance is back. So I am sure the latest version has a bug. Thank you!