Isolating JITDbl2Ulng helper changes #86175

khushal1996 · 2023-05-12T20:36:06Z

Draft PR for testing purposes. No need for review at this time.

This PR isolates the helper function changes to verify that the helper works correctly on its own. It relates to draft PR #84384.

This PR optimizes the following cases:

Case	Previous Code	Optimized Instruction
float -> ulong	CORINFO_HELP_DBL2ULNG Helper	vcvttss2usi

public static UInt64 FloatToULong(float val)
{
    return (UInt64)val;
}

Assembly before optimization

G_M22196_IG01:              ;; offset=0000H
       4883EC28             sub      rsp, 40
       C5F877               vzeroupper 
						;; size=7 bbWeight=1 PerfScore 1.25
G_M22196_IG02:              ;; offset=0007H
       62F17E085AC0         vcvtss2sd xmm0, xmm0
       E87E57815E           call     CORINFO_HELP_DBL2ULNG
       90                   nop      
						;; size=12 bbWeight=1 PerfScore 5.25
G_M22196_IG03:              ;; offset=0013H
       4883C428             add      rsp, 40
       C3                   ret      
						;; size=5 bbWeight=1 PerfScore 1.25

Assembly after optimization

G_M22196_IG01:              ;; offset=0000H
       C5F877               vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00
G_M22196_IG02:              ;; offset=0003H
       62F1FE0878C0         vcvttss2usi rax, xmm0
						;; size=6 bbWeight=1 PerfScore 6.00
G_M22196_IG03:              ;; offset=0009H
       C3                   ret

Case	Previous Code	Optimized Instruction
double -> ulong	CORINFO_HELP_DBL2ULNG Helper	vcvttsd2usi

public static UInt64 DoubleToULong(double val)
{
    return (UInt64)val;
}

Assembly before optimization

G_M30068_IG01:              ;; offset=0000H
       4883EC28             sub      rsp, 40
       C5F877               vzeroupper 
						;; size=7 bbWeight=1 PerfScore 1.25
G_M30068_IG02:              ;; offset=0007H
       E874577F5E           call     CORINFO_HELP_DBL2ULNG
       90                   nop      
						;; size=6 bbWeight=1 PerfScore 1.25
G_M30068_IG03:              ;; offset=000DH
       4883C428             add      rsp, 40
       C3                   ret

Assembly after optimization

G_M30068_IG01:              ;; offset=0000H
       C5F877               vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00
G_M30068_IG02:              ;; offset=0003H
       62F1FF0878C0         vcvttsd2usi rax, xmm0
						;; size=6 bbWeight=1 PerfScore 5.00
G_M30068_IG03:              ;; offset=0009H
       C3                   ret

Case	Previous Code	Optimized Instruction
ulong -> double	vcvtsi2sd	vcvtusi2sd

public static double UIntToDouble(UInt64 val)
{
    return (double)val;
}

Assembly before optimization

G_M33997_IG01:              ;; offset=0000H
       C5F877               vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00
G_M33997_IG02:              ;; offset=0003H
       62F17C0857C0         vxorps   xmm0, xmm0
       62F1FF082AC1         vcvtsi2sd  xmm0, rcx
       4885C9               test     rcx, rcx
       7D0A                 jge      SHORT G_M33997_IG03
       62F1FF08580502000000 vaddsd   xmm0, qword ptr [reloc @RWD00]
						;; size=27 bbWeight=1 PerfScore 12.58
G_M33997_IG03:              ;; offset=001EH
       C3                   ret

Assembly after optimization

G_M33997_IG01:              ;; offset=0000H
       C5F877               vzeroupper 
						;; size=3 bbWeight=1 PerfScore 1.00
G_M33997_IG02:              ;; offset=0003H
       62F1FF087BC1         vcvtusi2sd xmm0, rcx
						;; size=6 bbWeight=1 PerfScore 4.00
G_M33997_IG03:              ;; offset=0009H
       C3                   ret
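
The "before" listing for ulong -> double converts the register as a signed integer (vcvtsi2sd), tests the sign, and adds a constant when the value was negative. The sketch below restates that sequence in C++ next to a direct unsigned conversion. It is only an illustration: the function names are hypothetical, and the constant loaded from RWD00 is assumed to be 2^64, which the listing does not show.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical name. Restates the pre-change sequence step by step.
inline double ULongToDoubleViaSigned(std::uint64_t val)
{
    // Reinterpret the bits as signed (two's-complement conversion,
    // guaranteed since C++20 and the norm on mainstream compilers before).
    const std::int64_t asSigned = static_cast<std::int64_t>(val);

    double result = static_cast<double>(asSigned); // vcvtsi2sd xmm0, rcx
    if (asSigned < 0)                              // test rcx, rcx / jge
    {
        // vaddsd with the constant at RWD00, ASSUMED here to be 2^64.
        result += 18446744073709551616.0;
    }
    return result;
}

// Hypothetical name. Stands in for the single vcvtusi2sd instruction.
inline double ULongToDoubleDirect(std::uint64_t val)
{
    return static_cast<double>(val);
}

// Spot checks on a few boundary inputs only.
inline void CheckULongToDouble()
{
    const std::uint64_t inputs[] = {0, 1, 1ULL << 63, UINT64_MAX};
    for (std::uint64_t v : inputs)
    {
        assert(ULongToDoubleViaSigned(v) == ULongToDoubleDirect(v));
    }
}
```

The checks cover only the listed boundary inputs. The sketch makes no claim that the two paths round identically for every 64-bit value.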

…architecture. This is because we changed the JITDbl2Ulng helper function to mimic the new IEEE-compliant AVX512 instruction vcvtsd2usi. In the process, we needed to update the library test case, because the default floating-point error (FPE) value returned by the new instruction differs from the default MSVC FPE value, i.e. 0.
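
The commit messages in this PR describe the target behavior of the helper: truncate toward zero, and return ulong.max_value for negative, NaN, and ulong_max + 1 inputs. A minimal C++ sketch of that behavior follows. It is not the runtime's actual JITDbl2Ulng code; the name Dbl2UlngSketch is hypothetical, and treating infinities the same way as other out-of-range inputs is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical name; NOT the runtime's helper. Sketch of a truncating
// double -> uint64 conversion that returns all ones for invalid inputs.
inline std::uint64_t Dbl2UlngSketch(double val)
{
    constexpr std::uint64_t kInvalid = std::numeric_limits<std::uint64_t>::max();
    constexpr double kTwoPow64 = 18446744073709551616.0; // 2^64, exact in double

    if (std::isnan(val))
    {
        return kInvalid;
    }

    // Truncate first, so inputs in (-1, 0) become -0.0 and need no
    // separate edge-case branch.
    const double truncated = std::trunc(val);

    // Negative or too large (ASSUMED to include +/- infinity): invalid.
    if (truncated < 0.0 || truncated >= kTwoPow64)
    {
        return kInvalid;
    }

    // The value is now in [0, 2^64), so the cast is well defined.
    return static_cast<std::uint64_t>(truncated);
}

// Spot checks for the edge cases named in the commit messages.
inline void CheckDbl2UlngSketch()
{
    constexpr std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();
    assert(Dbl2UlngSketch(1.9) == 1);
    assert(Dbl2UlngSketch(-0.5) == 0);
    assert(Dbl2UlngSketch(-1.0) == kMax);
    assert(Dbl2UlngSketch(std::nan("")) == kMax);
    assert(Dbl2UlngSketch(18446744073709551616.0) == kMax);
}
```

Truncating before the range check is what lets values strictly between -1 and 0 convert to 0 without a dedicated branch, which appears to be the intent of the "truncate function" commit.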

ghost · 2023-05-12T20:36:17Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Issue Details

Draft PR for testing purposes. No need for review at this time.

Author:	khushal1996
Assignees:	-
Labels:	`area-CodeGen-coreclr`, `community-contribution`
Milestone:	-

khushal1996 added 4 commits May 9, 2023 16:22

fixing the JITDbl2Ulng helper function. The new AVX512 instruction vc…

7bae37a

…vtsd2usi uses ulong.max_value to show FPE for negative, NAN and ulong_max + 1 values.

Fixing the JITDbl2Ulng helper function. Also making sure that we are …

fbc134d

…not changing the library test case but the API to make sure NaN cases are handled.

reverting jitformat

293e84d

dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 12, 2023

ghost added the community-contribution Indicates that the PR has been added by a community member label May 12, 2023

khushal1996 added 7 commits May 15, 2023 16:31

Adding a truncate function to the Dbl2Ulng helper to make sure we avo…

6d14c22

…id handling edge cases (-1,0) separately inside the helper.

Adding code to handle vectorized conversion for float/double to/from …

d977447

…ulong/uint

reverting changes for float to ulong

0845905

enabling float to ulong conversion

451780e

Making change to set w1 bit for evex

06ecf6a

trying to return EA_4BYTE for INS_vcvttss2usi to make sure that we re…

6b963e6

…ad dword and not qword for float to ulong

jit format

e8f06d9

build-analysis bot mentioned this pull request May 18, 2023

Could not load file or assembly 'Microsoft.CodeAnalysis.NetAnalyzers #84995

Closed

This was referenced May 18, 2023

Infra improvements for Helix #68176

Closed

Methodical_others test JIT/Methodical/Coverage/copy_prop_byref_to_native_int crashing #69832

Open

Long Running Test: Interop/MonoAPI/MonoMono/PInvokeDetach/PInvokeDetach.sh #73040

Closed

Splitting vcvttss2usi to vcvttss2usi32 and vcvttss2usi64. Also adding…

c703c1b

… a special handling for vcvttss2usi64 to make sure we read only dword instead of qword for float to ulong conversion

build-analysis bot mentioned this pull request May 18, 2023

Failed USB connection via port 54050, error 61, in tvOS arm64 Release AllSubsets_Mono #82637

Open

undoing jitformat changes due to merge error

bb4a91e

khushal1996 closed this May 19, 2023

ghost locked as resolved and limited conversation to collaborators Jun 18, 2023
