-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RyuJIT/ARM32] low performance compared to amd64 investigation (System.Tests.Perf_Char.Char_ToUpper test). #13049
Comments
u_toupper is ICU API. the following is the code going from TextInfo to ICU. https://github.com/dotnet/coreclr/blob/27790ab2dbee25d2e9c6fc41863aa7e983552a3f/src/System.Private.CoreLib/shared/System/Globalization/TextInfo.Unix.cs#L51
Is it possible you can measure the perf of the ICU u_toupper()/u_tolower(), just to isolate if the issue is really the ICU or we have other issue in the managed side. CC @adamsitnik |
I did additional tests for ARM32 in order to get some u_toupper()/u_tolower() calls performance.
Accordingly to my logs, in this test u_toupper() was called 0 times. |
From the table it looks you are using ASCII characters ('A' and 'a'). we have optimization for the ASCII characters and will not call ICU in such cases. you can try it with non ASCII characters and you should get u_toupper get called. could you paste the perf comparison data data in ASCII and non ASCII scenarios? Sorry I don't have handy VMs to do it. @janvorli I recall you mentioned before it is not right to compare the perf of 64-bit against 32-bit. is this right? |
Sure, here is non ASCII test results for ARM32:
But, the point is the same - for tests with "zh*" cultureName, I see huge performance drop and this is not connected to ICU u_toupper() calls time for sure, plus, ASCII tests with same performance drop on "zh*" cultureName tests don't call u_toupper() at all. |
That is very interesting. The only place that can depends on the culture operation when using Ascii character is But this code should be executed once per culture. Where is the code of the test |
RAW logs are quite big (189,6 MB for ARM32 and 125,7MB for AMD64), you can download packed files here: |
@viewizard I am asking where I can find the C# code of this test? |
Thanks @viewizard for the pointer. Could you please try update the test to something like the following and remeasure? I am just trying to validate the theory that the one time initialization in the text info is the place which showing the perf difference. [BenchmarkCategory(Categories.CoreFX)]
public class Perf_Char
{
private static CultureInfo s_en_US = new CultureInfo("en-US");
private static CultureInfo s_zh_Hans = new CultureInfo("zh-Hans");
public static IEnumerable<object[]> Char_ChangeCase_MemberData()
{
// force one time initialization
char a = char.ToUpper('a', s_en_US);
char b = char.ToUpper('a', s_zh_Hans);
yield return new object[] { 'A', s_en_US }; // ASCII upper case
yield return new object[] { 'a', s_en_US }; // ASCII lower case
yield return new object[] { '\u0130', s_en_US }; // non-ASCII, English
yield return new object[] { '\u4F60', s_zh_Hans }; // non-ASCII, Chinese
}
[Benchmark]
[ArgumentsSource(nameof(Char_ChangeCase_MemberData))]
public char Char_ToLower(char c, CultureInfo cultureName) => char.ToLower(c, cultureName); // the argument is called "cultureName" instead of "culture" to keep benchmark ID in BenchView, do NOT rename it
[Benchmark]
[ArgumentsSource(nameof(Char_ChangeCase_MemberData))]
public char Char_ToUpper(char c, CultureInfo cultureName)=> char.ToUpper(c, cultureName);
}
} |
@tarekgh, here is new log for ARM32 with code you provided above:
Just in case, here i new AMD64 log with code you provided above:
|
@viewizard thanks a lot for providing the data. This looks ICU is somehow contributing to the difference as you can see even en-US culture showing a big difference too in case of non ASCII character. I believe next step is to get a micro benchmark numbers to see the profiling internal methods and interops costs. |
Due to lack of recent activity, this issue has been marked as a candidate for backlog cleanup. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will undo this process. This process is part of our issue cleanup automation. |
Initial thread was started by @alpencolt https://github.com/dotnet/coreclr/issues/23520
Initial performance test results generated by @alpencolt
https://gist.github.com/alpencolt/0580af0be86e49bb9d89508dabcd8615
I investigate "System.Tests.Perf_Char.Char_ToUpper" test performance drop for arm32, compared to amd64. First 3 tests looks fine, but on fourth we have huge performance drop.
I logged disassembly for both, amd64 and arm32 (Char_ToUpper_amd.txt
Char_ToUpper_arm.txt), but didn't find any issues in asm, that may drop performance so much.
I find out, that the drop by somehow connected to CultureInfo:
On both (amd64 and arm32) systems installed Ubuntu 16.04 with same libICU version:
I see, libICU usage in test, but I don't see any CultureInfo-libICU connection in this test related code, all what I see is u_toupper() calls without any initialization during test execution.
Looks like I am stuck. Could someone route me, please?
The text was updated successfully, but these errors were encountered: