Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Unix] Potential performance regression in Directory.EnumerateFiles #41739

Closed
adamsitnik opened this issue Sep 2, 2020 · 12 comments · Fixed by #41817
Closed

[Unix] Potential performance regression in Directory.EnumerateFiles #41739

adamsitnik opened this issue Sep 2, 2020 · 12 comments · Fixed by #41817
Labels
area-System.IO tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Milestone

Comments

@adamsitnik
Copy link
Member

From the data I have, it looks like Directory.EnumerateFiles might have very recently regressed on Unix systems. I've used the word "might" because I am not 100% sure - most of the Unix samples confirm it, but two don't. The ones that don't used the older version of 5.0 (see the table below). Since the bot has not detected it yet, I suspect that this is a regression that has been introduced very recently.

@DrewScoggins I've tried to concatenate URL for full historical data of this benchmark but I've failed. Could you please provide such a link?

Repro

git clone https://github.com/dotnet/performance.git
python3 ./performance/scripts/benchmarks_ci.py -f netcoreapp3.1 netcoreapp5.0 --filter 'System.IO.Tests.Perf_Directory.EnumerateFiles'

System.IO.Tests.Perf_Directory.EnumerateFiles

Result Base Diff Ratio Alloc Delta Modality Operating System Bit Processor Name Base V Diff V
Faster 3252970.00 2991578.75 1.09 +0 Windows 10.0.19041.388 X64 AMD Ryzen 9 3900X 3.1.6 5.0.20.41714
Faster 3472508.59 3184986.25 1.09 +0 Windows 10.0.18363.959 X64 Intel Xeon CPU E5-1650 v4 3.60GHz 3.1.6 5.0.20.40203
Faster 3497412.50 3198568.75 1.09 -5 Windows 10.0.18363.959 X64 Intel Xeon CPU E5-1650 v4 3.60GHz 3.1.6 5.0.20.40416
Same 4347106.25 4279670.31 1.02 +0 Windows 10.0.19041.450 X64 Intel Core i7-5557U CPU 3.10GHz (Broadwell) 3.1.6 5.0.20.40416
Same 4130054.69 3936661.72 1.05 +0 Windows 10.0.19041.450 X64 Intel Core i7-6700 CPU 3.40GHz (Skylake) 3.1.6 5.0.20.40416
Faster 4641105.47 4299395.31 1.08 +0 Windows 10.0.19042 X64 Intel Core i7-7700 CPU 3.60GHz (Kaby Lake) 3.1.6 5.0.20.40416
Faster 4712802.08 4414621.09 1.07 +0 Windows 10.0.19041.450 X64 Intel Core i7-8650U CPU 1.90GHz (Kaby Lake R) 3.1.6 5.0.20.41714
Faster 5317144.90 4536003.28 1.17 +9 ubuntu 18.04 X64 Intel Xeon CPU E5-1650 v4 3.60GHz 3.1.6 5.0.20.40203
Slower 3566250.86 18230817.71 0.20 +40 manjaro X64 Intel Core i7-4771 CPU 3.50GHz (Haswell) 3.1.6 5.0.20.41714
Slower 5551243.31 27547527.63 0.20 +63 pop 20.04 X64 Intel Core i7-6600U CPU 2.60GHz (Skylake) 3.1.6 5.0.20.41714
Slower 4829429.69 18987115.38 0.25 +187 alpine 3.11 X64 Intel Core i7-7700 CPU 3.60GHz (Kaby Lake) 3.1.6 5.0.20.41714
Faster 5108726.04 4192573.44 1.22 +0 ubuntu 18.04 X64 Intel Core i7-7700 CPU 3.60GHz (Kaby Lake) 3.1.6 5.0.20.40416
Slower 13388079.00 36065247.50 0.37 +144 ubuntu 16.04 Arm64 Unknown processor 3.1.6 5.0.20.41714
Slower 12747264.15 37910838.00 0.34 +144 ubuntu 16.04 Arm64 Unknown processor 3.1.7 5.0.20.41714
Slower 13230105.95 37753812.25 0.35 +144 ubuntu 16.04 Arm64 Unknown processor 3.1.6 5.0.20.41714
Slower 1440853.33 25140212.50 0.57 +144 bimodal ubuntu 18.04 Arm64 Unknown processor 3.1.6 5.0.20.41714
Same 3361610.00 3283831.88 1.02 -3 Windows 10.0.18363.959 X86 Intel Xeon CPU E5-1650 v4 3.60GHz 3.1.6 5.0.20.41714
Same 4270568.75 4130793.75 1.03 -3 Windows 10.0.19041.450 X86 Intel Core i7-5557U CPU 3.10GHz (Broadwell) 3.1.6 5.0.20.40416
Faster 4717593.75 4454796.88 1.06 +0 Windows 10.0.18363.1016 Arm Microsoft SQ1 3.0 GHz 3.1.6 5.0.20.40416
Slower 12141835.79 78421701.63 0.15 +118 macOS Catalina 10.15.6 X64 Intel Core i5-4278U CPU 2.60GHz (Haswell) 3.1.6 5.0.20.41714
Slower 9970166.05 66796449.00 0.15 +2723 macOS Catalina 10.15.6 X64 Intel Core i7-4870HQ CPU 2.50GHz (Haswell) 3.1.6 5.0.20.41714
Same 11711681.63 11203108.66 1.05 +26 macOS Mojave 10.14.5 X64 Intel Core i7-5557U CPU 3.10GHz (Broadwell) 3.1.6 5.0.20.40203
@adamsitnik adamsitnik added area-System.IO tenet-performance Performance related issue labels Sep 2, 2020
@adamsitnik adamsitnik added this to the 5.0.0 milestone Sep 2, 2020
@Dotnet-GitSync-Bot Dotnet-GitSync-Bot added the untriaged New issue has not been triaged by the area owner label Sep 2, 2020
@jeffschwMSFT jeffschwMSFT removed the untriaged New issue has not been triaged by the area owner label Sep 2, 2020
@jeffhandley
Copy link
Member

jeffhandley commented Sep 2, 2020

@carlossanlop could this have been caused by #40641? Please make sure to investigate this while we still have time to consider it for RC2. Thanks!

Note: If it turns out that the fix we made demands this extra perf cost, we can close this as an acceptable perf regression.

@adamsitnik
Copy link
Member Author

@tmds might have some interesting input on this

@danmoseley
Copy link
Member

I think you found it @jeffhandley it looks like every entry now has an extra lstat ? https://github.com/dotnet/runtime/pull/40641/files#diff-9d4df56e6a88e5d7c54862af2249715cR58

Another option maybe is to reverse the bug fix and find another approach for 6.0 ?

cc @iSazonov

@carlossanlop
Copy link
Member

@tmds might have some interesting input on this

@adamsitnik where can I find his input?

@jeffhandley I'll compare the perf on Linux with and without that change. So we are ok with the perf regression if this turned out to be the root cause? @danmosemsft would you be ok with that?

@carlossanlop
Copy link
Member

Update: My change made the benchmark 4.3 times slower, I think we should revert it:

summary:
worse: 1, geomean: 4.371
total diff: 1

| Slower                                        | diff/base | Base Median (ns) | Diff Median (ns) | Modality|
| --------------------------------------------- | ---------:| ----------------:| ----------------:| --------:|
| System.IO.Tests.Perf_Directory.EnumerateFiles |      4.37 |       3996385.16 |      17467169.23 |         |

No Faster results for the provided threshold = 0.001% and noise filter = 0.3ns.

@danmoseley danmoseley added the tenet-performance-benchmarks Issue from performance benchmark label Sep 3, 2020
@jeffhandley
Copy link
Member

Good call, @carlossanlop. I was leaning toward keeping it, but those are some convincing numbers. Please proceed with reverting the fix.

@iSazonov
Copy link
Contributor

iSazonov commented Sep 3, 2020

I think we should revert it

In #40641 we concluded to merge as is and then working on refactoring and removing extra P/Invokes.
We would want to get #40641 in PowerShell. (It is not critical.)

Update: I think a consistency in the API implemented in #40641 is important and the perf regression could be simply fixed by removing new P/Invoke for Hidden on Unix-s.

@stephentoub
Copy link
Member

We should revert it for 5.0 and then revisit for 6.0. A 4-5x perf regression in such a mainline case as enumerating files is not acceptable.

@iSazonov
Copy link
Contributor

iSazonov commented Sep 3, 2020

then revisit for 6.0

I hope it will be in 5.0.1 servicing release - #40641 fixes really a bug.

@adamsitnik
Copy link
Member Author

Are we sure that reverting #40641 is the only solution?

I am not familiar with the implementation and underlying Unix APIs, but since the problem is calling lstat for every file separately to obtain readonly and hidden information is it possible to:

  • Obtain the missing information by using a different method|overload than ReadDirR in FindNextEntry and eliminate the call to lstat?

private unsafe void FindNextEntry(byte* entryBufferPtr, int bufferLength)
{
int result = Interop.Sys.ReadDirR(_directoryHandle, entryBufferPtr, bufferLength, out _entry);

  • make the Attributes implementation lazy? (load the info when asked for the first time)

  • change the pattern from getting each entry separately to getting all of them with all information at once?

  • call lstat only for non-Linux OSes (!OperatingSystem.IsLinux()) since Linux does not allow for hidden files? I assume that we would still miss the readonly information on Linux ;(

@adamsitnik
Copy link
Member Author

where can I find his input?

@carlossanlop when I tagged @tmds I was hoping to get his input, but he has not provided it yet ;)

@stephentoub
Copy link
Member

Are we sure that reverting #40641 is the only solution?

For 5.0 specifically. We need to get back to a good state. Then other fixes for the minor issue that was being addressed by the PR can be investigated: that issue does not meet the 5.0 bar.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-System.IO tenet-performance Performance related issue tenet-performance-benchmarks Issue from performance benchmark
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants